Chapter 1


Motivation, Motivation, Motivation 


In this book we will develop a theory, which is deeply rooted in experiment 
and can only be understood using a new mathematical language. It not only 
describes how nature works in all microscopic physical systems, but also in 
macroscopic physical systems. 

It is most important, however, when the characteristic length scale of the
physical system is smaller than $10^{-4}$ m.

This theory is called quantum mechanics. 

The physical world we will discover in our studies is a strange and fascinating 
place where nature operates in a way that seems to defy the intuition we have 
built up living among macroscopic systems. We will not endeavor to explain 
why nature works in this particular way since it is my strong belief, as we will 
see in this book, that it is not possible to do so within the context of this theory. 
We will, however, be able to correctly predict the outcomes of an amazingly wide 
range of experiments in many different fields of physics, chemistry and biology. 

Let me emphasize my strong belief that theories in physics should only endeavor
to make predictions about experimental measurements and not attempt to provide
reasons for why nature works in these particular ways, that is, why we must
choose to start with certain postulates.

Feynman put it best. 

We know how the electrons and light behave. But what can I call it? 

If I say they behave like particles I give the wrong impression; also 
if I say they behave like waves. They behave in their own inimitable 
way, which technically could be called a quantum mechanical way. 

They behave in a way that is like nothing that you have seen before. 

Your experience with things that you have seen before is incomplete. 







The behavior of things on a very small scale is simply different. An
atom does not behave like a miniature representation of the solar
system with little planets going around in orbits. Nor does it appear
like a cloud or fog of some sort surrounding the nucleus. It
behaves like nothing you have ever seen before.

There is one simplification at least. Electrons behave in this respect
exactly the same way as photons; they are both screwy, but in exactly
the same way.

The difficulty really is psychological and exists in the perpetual torment
that results from your saying to yourself “but how can it really
be like that?”, which is a reflection of an uncontrolled but vain desire
to see it in terms of something familiar. I will not describe it in
terms of an analogy with something familiar; I will simply describe
it...

I am going to tell you what nature behaves like. If you will simply
admit that maybe she does behave like this, you will find her a delightful
and entrancing thing. Do not keep saying to yourself, if you
can avoid it, “but how can it really be like that?” because you will
get “down the drain”, into a blind alley from which nobody has yet
escaped.

Nobody knows how it can be like that. 

and we can add an addendum:

and “nobody knows why it is like that.”

We will not be able to reduce the quantum universe to everyday ways of thinking
(usually called common sense). In fact, in order to understand the ideas and
implications of the theory we will have to adjust all of our ways of thinking at
the most fundamental level.

Imagine, for a moment, that you are attempting to understand a new culture.
If you are serious about it, the first thing you would do is to learn the language
appropriate to that culture so that you can put your experiences in the proper
context. Understanding the universe of quantum phenomena is much like
understanding a new culture where the appropriate language is mathematics and
the experiences we are attempting to put into context are experiments.

As we shall see, we will have to use a mathematical language to describe the
quantum world since ordinary language, which was developed to explain everyday
occurrences (experiments on macroscopic objects), will turn out to be totally
inadequate.

Since it makes no sense to attempt any understanding of the nature of quantum
phenomena without first learning to speak and use the language of the quantum
world, we will spend the first several chapters of this book learning the
appropriate mathematics, in particular, the subject of linear vector spaces.

The adjustment of our ways of thinking at the fundamental level that will be
needed is not simply a mathematical matter, however. The development of the
necessary mathematical language will not come into conflict with our everyday
modes of thinking in any major way. Although the mathematics of linear vector
spaces is very elegant, you will be able to understand it without much difficulty
and without having your basic view of the world changed at any fundamental
level.

You will be troubled, however, when we apply the mathematics to physical systems
that develop according to quantum ideas. We will attach physical meaning
to the mathematical formalism in ways that will conflict with your
well-developed views (I will call these classical views) about how the world works.

After studying wave mechanics, we will rethink the mathematics and the quantum
theory using the Dirac language, which, as we shall see, incorporates the
very nature of the quantum world in an intrinsic and natural way. Since we
are attempting to develop a physical theory, we will link all the mathematical
concepts that we introduce to physical concepts as we proceed.

Dirac was able to link the physical structure of quantum mechanics with the
mathematical structure in a unique way. His mathematical language incorporates
the physical meaning directly into the formalism and the calculational
methods. The language explicitly exhibits the physics and clearly exposes the
internal logic of quantum mechanics. Once we understand the language, every
equation will directly convey its physical meaning without the need for further
explanation or any need for inadequate models.

It is very important to understand that the Dirac language is not simply a new
notation for quantum mechanics (as many physicists seem to think). It is a way
of thinking about quantum mechanics. It will allow us to use the physical ideas
of quantum mechanics to develop the appropriate mathematical language rather
than the other way around. This will allow the very mathematical quantum theory
to be more closely connected to experiment than any other physical theory.

These statements about the importance of understanding the mathematical language
appropriate to the physics under consideration do not apply only to the
quantum world. They are true for all areas of physics and the other sciences. One
should always learn the appropriate language before studying any field that relies on
that language for its understanding.

The first part of this book will cover various aspects of the original formulations
of quantum mechanics. We will concentrate on the development of the theory
in terms of the position and the momentum, namely, wave mechanics.


1.1. Basic Principles and Concepts of Quantum Theory

There are several equivalent formulations of quantum mechanics, including

1. Schrödinger wave mechanics

2. Heisenberg matrix mechanics

Dirac developed a general formalism for the quantum theory, which includes
wave mechanics, matrix mechanics, and other formulations as special cases. We
shall first develop Schrödinger wave mechanics and then generalize it to the
Dirac formalism, which is very abstract, in later chapters of this book.

In these notes, the quantum formalism will be applied mainly to non-relativistic
systems ($v \ll c = 3 \times 10^{10}$ cm/s).

1.1.1. Mathematical Methods 

We will have to use various mathematical techniques in order to formulate quantum
mechanics and in order to solve the resulting equations. Some of these techniques
you have learned in a Mathematical Methods in Physics course, some you
have learned in Mathematics courses such as Linear Algebra and Multivariable
Calculus, and some we will develop in this book.

The techniques we will need are linear operators in Hilbert space (an extension of 
linear algebra), partial differential equations, special functions of mathematical 
physics (Legendre polynomials, spherical harmonics, Bessel functions), Fourier 
transforms, use of Green’s functions, integral equations, contour integration in 
the complex plane, group theory and group representations, etc. We will cover 
many of these techniques in detail as we proceed. 

As we proceed we will cover many applications of quantum theory to physical 
systems in the areas of atomic and molecular physics. 

We will use the following two definitions: 

1. macroscopic phenomena - observable with the naked eye or with an ordinary
microscope; length scale > $10^{-4}$ cm (1 micron).

2. microscopic phenomena - atomic and subatomic; length scale < $10^{-8}$ cm =
0.1 nm.

1.1.2. Classical Concepts

We cannot throw out all of classical physics for it works well when dealing 
with macroscopic phenomena. Furthermore, quantum theory relies on various 
formulations of classical physics, namely, 

1. Classical mechanics (action principle, Lagrangian formulation, Hamiltonian
formulation)

2. Classical electromagnetism (Maxwell’s equations, Lorentz force law)

3. Thermodynamics and Statistical Mechanics 

4. Special Relativity (important when speed comparable to c) 

Classical physics accurately describes macroscopic phenomena. 

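1. Classical mechanics:

$$\vec{p} = \text{linear momentum} , \quad \vec{p} = m\vec{v} \quad \text{(non-relativistic)}$$

$$\vec{F} = \frac{d\vec{p}}{dt} , \quad \vec{p} = \frac{m\vec{v}}{\sqrt{1 - \frac{v^2}{c^2}}} \quad \text{(relativistically correct)} \tag{1.1}$$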

2. Types of forces and how they are produced: 

(a) Electromagnetism (Maxwell’s equations, Lorentz force law) 

(b) Gravitation (Newton’s law of gravitation for weak gravitational fields, 
Einstein’s general theory of relativity) 

3. Thermodynamics and Statistical Mechanics - describes average properties 
of systems containing many particles. 

There are no logical inconsistencies in classical physics. However, it turns out, 
as we will see, that microscopic phenomena do not obey the laws of classical 
physics. 


Quantum physics is the theory describing microscopic phenomena and macroscopic
phenomena. It approximates the laws of classical physics for non-quantum
macroscopic phenomena.

1. Quantum mechanics - gives the equations of motion for a system (new 
non-classical concepts are needed). 

2. Types of interactions among microscopic particles: 

(a) range of interaction extends to macroscopic distances

i. electromagnetism

ii. gravitation

(b) interactions that are negligible at macroscopic distances, that is, they act
only when particles are microscopically separated

i. strong interactions (hold the protons and neutrons of a nucleus
together)

ii. weak interactions (responsible for the beta-decay of nuclei)

These are the only interactions known today. In the 1970s electromagnetism
and the weak interactions were unified into the electroweak interaction.
In the 1980s, the electroweak interaction and the strong interaction
were combined as parts of the Standard Model.

3. Quantum Statistical Mechanics - approaches classical statistical mechanics 
at sufficiently high temperatures. 

1.1.3. The Fundamental Principle of Quantum Mechanics 

To introduce this principle, let us consider the experimentally observed
alpha-decay of certain nuclei. We crudely describe an atom in this way:

1. radius of volume occupied by orbiting electrons = $10^{-8}$ cm

2. radius of volume occupied by nucleus = $10^{-13}$ cm

where 1 Å = $10^{-8}$ cm and 1 fermi = $10^{-13}$ cm.

As we shall see, we have no idea what is meant by the word orbiting in this
context. Such occurrences will be in the emphasis font.

The nucleus contains nucleons (protons and neutrons) bound together by the 
strong interaction, which overcomes the electrostatic repulsion of the protons 
(the gravitational attraction is negligible). 

We also have the following data and definitions: 

electron charge = $-1.60 \times 10^{-19}$ coul = $-4.80 \times 10^{-10}$ esu

proton charge = -(electron charge)

neutron charge = 0

electron mass = $9.11 \times 10^{-28}$ gm

proton mass ≈ neutron mass = $1.67 \times 10^{-24}$ gm

(the neutron is slightly more massive than the proton) 

Z = atomic number of the nucleus = number of protons 
= number of electrons in a neutral atom 
A = mass number of the nucleus = number of nucleons 

Notation:

${}_{Z}(\text{chemical symbol})^{A}$ : ${}_{92}\mathrm{U}^{232}$ = uranium 232 , ${}_{90}\mathrm{Th}^{228}$ = thorium 228

An α-particle is a helium-4 nucleus or ${}_{2}\mathrm{He}^{4}$. Certain unstable nuclei
spontaneously emit an α-particle (α-decay)

$${}_{92}\mathrm{U}^{232} \rightarrow {}_{90}\mathrm{Th}^{228} + \alpha \quad \text{or} \quad A = 232 = 228 + 4 \ \text{ and } \ Z = 92 = 90 + 2 \tag{1.2}$$

or we have the setup (classical model) shown in Figure 1.1 below.

[Figure 1.1: Decay Picture 1. Before decay: the nucleus is at rest, so the total linear momentum is zero. After decay: the decay products move apart and the total linear momentum must remain zero.]


Experimental Observation: Let $N_0$ (a very large number, ~ $10^{23}$) identical
U$^{232}$ nuclei be observed at t = 0. Then, the α-decays of these nuclei will not
occur at the same time. Furthermore, the alpha particles emitted in the decays
will not all go in the same direction, as shown in Figure 1.2 below.

[Figure 1.2: Decay Picture 2. Different nuclei decay at different times and emit their α-particles in different directions.]

Let N(t) be the number of U$^{232}$ nuclei (not yet decayed) present at time t. It
is observed that

$$N(t) = N_0 e^{-\gamma t} , \quad \gamma = \text{constant (dimensions 1/time)} \tag{1.3}$$

as shown in Figure 1.3 below.

[Figure 1.3: Decay Data. N(t) plotted against t; $\tau_{1/2}$ = 70 years for U$^{232}$.]


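Therefore,

$$\frac{dN(t)}{dt} = -\gamma N_0 e^{-\gamma t} = -\gamma N(t) \tag{1.4}$$

$$\frac{(-dN)}{N} = \gamma\,dt = \frac{\#\ \text{decays in } dt}{\#\ \text{present at time } t} = \text{probability of a decay between } t \text{ and } t + dt \tag{1.5}$$

Therefore, γ = probability of decay per unit time. Note that

$$N(\tau_{1/2}) = \frac{N_0}{2} = N_0 e^{-\gamma\tau_{1/2}} \tag{1.6}$$

$$\Rightarrow \ \frac{1}{2} = e^{-\gamma\tau_{1/2}} \ \Rightarrow \ \tau_{1/2} = \frac{\ln(2)}{\gamma} \tag{1.7}$$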

Thus, systems in the same initial state will, in general, develop differently in
time, that is, U$^{232}$ nuclei live for different lengths of time. Since the initial
states are identical (there does not exist any way to distinguish the initial
states), it is impossible to predict when a given nucleus will decay.

However, there does exist a definite probability of a decay per unit time. This
means that we should be able to predict how many decays will occur in a given
time interval when we observe a sample of many U$^{232}$ nuclei.
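The two statements above, that individual decay times are unpredictable while the number of decays in a given interval is predictable, can be illustrated numerically. What follows is a minimal Python sketch and not part of the original development: the function names, the sample size, the time step and the random seed are all illustrative assumptions, and the half-life of 70 years is simply the value quoted with Figure 1.3. Each surviving nucleus is given a decay probability γ dt in every short step dt, as in equation (1.5).

```python
import math
import random


def decay_constant(half_life):
    """gamma = ln(2) / tau_half, equation (1.7)."""
    return math.log(2.0) / half_life


def expected_survivors(n0, gamma, t):
    """N(t) = N0 * exp(-gamma * t), equation (1.3)."""
    return n0 * math.exp(-gamma * t)


def simulate_survivors(n0, gamma, t_end, dt, rng):
    """Monte Carlo sketch: in each step of length dt every surviving
    nucleus decays with probability gamma*dt (assumes gamma*dt << 1)."""
    n = n0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        decays = sum(1 for _ in range(n) if rng.random() < gamma * dt)
        n -= decays
    return n


if __name__ == "__main__":
    gamma = decay_constant(70.0)      # per year (assumed half-life of 70 years)
    rng = random.Random(0)            # fixed seed so the run is repeatable
    n_sim = simulate_survivors(5000, gamma, 70.0, 1.0, rng)
    n_exp = expected_survivors(5000, gamma, 70.0)
    print("simulated survivors:", n_sim)
    print("expected survivors: ", round(n_exp))
```

With these assumed settings the simulated number of survivors after one half-life should come out close to $N_0/2$, with run-to-run fluctuations of order $\sqrt{N_0}/2$; which particular nuclei decay, and when, differs from run to run.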


This decay process may be viewed in another way. Before the decay, the
α-particle is located in the U$^{232}$ nucleus. After the decay, the α-particle is
separated from the residual nucleus and is traveling in a certain direction. It is
impossible to predict the α-particle’s position (inside U$^{232}$ or separated from
Th$^{228}$) as a function of time, and it is impossible to predict the α-particle’s
velocity direction as a function of time. One may only hope to find the probability
that the α-particle will be at a given position at a given time and the probability
that the α-particle will be traveling at a given velocity at a given time.

This leads us to the Basic Principle of Quantum Mechanics: 

One can only predict the probability of finding a particle at a given 
position at a given time and the probability of finding a particle with 
given momentum (momentum is easier to deal with than velocity) 
at a given time. The result of measuring a given particle’s position 
or momentum at a certain time cannot be predicted in general - 
only the results of measuring the position or momentum of many 
identically-prepared particles at a certain time can be predicted. 

Note that this is contrary to the Classical Doctrine, which states that a particle’s
position and momentum at any time are completely determined by the particle’s
initial state (specified by the initial position and momentum). In quantum
mechanics, a complete specification of the particle’s initial state (we will discuss
later exactly what must be given for a complete quantum mechanical specification
of a state) does not determine a particle’s position and momentum at all
later times - only the probability that the particle will have a certain position
and momentum can be predicted for observations made at later times.

Note (1): You might argue that if the U$^{232}$ nuclei (which we have stated to
be in identical initial states so that there does not exist any way to distinguish
the states) decay at different times, then there must be some difference
in the initial states. You might argue that there exists some hidden variable
(which we have not as yet succeeded in determining) which has different values
in the different nuclei and which determines when a given nucleus will decay.
This is certainly a possibility. However, no one has ever found such a variable so
that the time at which the decay will occur can be predicted with certainty.
In these notes, we will take the point of view (standard quantum theory) that
such hidden variables do not exist. In fact, there now exists much experimental
evidence that hidden variables cannot exist (more about this later).

Note (2): How does classical physics fit into this description? How is classical
determinism (observed in macroscopic phenomena) compatible with this probabilistic
interpretation of nature? This is easy to deal with. For macroscopic
objects, the probability of observing the classical trajectory (position and momentum
as a function of time) to an accuracy of better than about $10^{-4}$ cm is almost
unity (negligible uncertainty). This is known as the Correspondence Principle,
which implies that quantum physics approaches classical physics for macroscopic
objects (more about this later).

The central problem in quantum mechanics is therefore the following:

Given a particle with known interactions: in terms of the initial
state of the particle, find the probability of observing the particle
at a given position as a function of time and find the probability of
observing the particle with a given momentum as a function of time.

Before giving rules for determining these probabilities for physical
systems, let us review some simple probability concepts.


1.2. Simple Ideas about Probability 

1.2.1. Discrete Distributions 

Let a variable take on certain discrete values $u_1, u_2, u_3, \ldots$. Suppose N
measurements of the variable are made. Let $N_i$ = # of measurements in which the result
$u_i$ was obtained.

Definition

The probability of observing $u_i$ is

$$p(u_i) = \frac{N_i}{N} \tag{1.8}$$

in the limit as $N \to \infty$ ($N_i$ also $\to \infty$ unless $p(u_i) = 0$). We must also have
$p(u_i) \ge 0$.

The probability of observing $u_k$ or $u_{k+1}$ or ... or $u_{k+\ell}$ = $\sum_{i=k}^{k+\ell} p(u_i)$. The
probability of observing some value (from the set of all possible values) is given
by

$$\sum_{\text{all } i} p(u_i) = \sum_{\text{all } i} \frac{N_i}{N} = \frac{1}{N}\sum_{\text{all } i} N_i = \frac{N}{N} = 1 \tag{1.9}$$

which is called the normalization condition.
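The frequency definition (1.8) and the normalization condition (1.9) can be tried out on simulated data. The sketch below is only an illustration: the helper name `relative_frequencies` and the choice of a fair six-sided die as the measured "variable" are assumptions made for the example, not something taken from the text.

```python
import random
from collections import Counter


def relative_frequencies(measurements):
    """Estimate p(u_i) = N_i / N from a list of measured values."""
    n_total = len(measurements)
    counts = Counter(measurements)          # N_i for each observed value u_i
    return {value: n_i / n_total for value, n_i in counts.items()}


if __name__ == "__main__":
    rng = random.Random(0)                  # fixed seed so the run is repeatable
    rolls = [rng.randint(1, 6) for _ in range(60000)]
    p = relative_frequencies(rolls)
    for u in sorted(p):
        print(u, round(p[u], 4))            # each should be near 1/6
    print("sum of probabilities:", sum(p.values()))
```

As N grows, the estimated $p(u_i)$ settle toward fixed values, while their sum equals 1 (up to floating-point rounding) for every N, exactly as (1.9) requires.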


1.2.2. Continuous Distributions 

Let the variable u be capable of taking on any value in the interval [a, b]. Suppose
N measurements of the variable are made. Let dN(u) = # of measurements
in which the variable was in the interval [u, u + du]. dN(u) is not meant
to represent the differential of some function N(u).

Definition

The probability of observing the variable in the interval [u, u + du] is given by

$$p(u)\,du = \lim_{N \to \infty} \frac{dN(u)}{N} \tag{1.10}$$

$$p(u) = \frac{\text{probability}}{\text{unit interval of } u} = \text{probability density} \tag{1.11}$$

The probability of measuring u in the interval $[u_1, u_2]$ is

$$\text{prob}([u_1, u_2]) = \int_{u_1}^{u_2} p(u)\,du \tag{1.12}$$

Since all measurements yield values in [a, b], we have the normalization condition

$$\int_a^b p(u)\,du = 1 \tag{1.13}$$

Example A 

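It is equally probable to measure u anywhere in the interval [0, a]. Therefore,

$$p(u)\,du = A\,du \ \text{ for } u \in [0, a] , \quad A = \text{constant}$$

But

$$\int_0^a p(u)\,du = 1 = A\int_0^a du = Aa \ \rightarrow \ A = \frac{1}{a} \ \rightarrow \ p(u) = \frac{1}{a} \ \text{ for } u \in [0, a]$$

[Figure 1.4: Example A - Probability Distribution. p(u) plotted against u: constant at 1/a between u = 0 and u = a.]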


Example B 

Consider the Gaussian distribution in the interval ($u \in [-\infty, \infty]$) given by

$$p(u) = A e^{-(u - u_0)^2/2\sigma^2} \tag{1.14}$$

where $u_0$, σ, A are constants and σ > 0. This looks like Figure 1.5 below.

[Figure 1.5: Example B - Probability Distribution. A Gaussian curve showing the spread in u values.]


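We then have

$$\int_{-\infty}^{\infty} p(u)\,du = 1 = A\int_{-\infty}^{\infty} e^{-(u - u_0)^2/2\sigma^2}\,du \tag{1.15}$$

If we let

$$v = \frac{u - u_0}{\sqrt{2}\sigma} \ \rightarrow \ dv = \frac{du}{\sqrt{2}\sigma} \tag{1.16}$$

we get

$$1 = A\sigma\sqrt{2}\int_{-\infty}^{\infty} e^{-v^2}\,dv \tag{1.17}$$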

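Trick for doing the integral:

$$\left(\int_{-\infty}^{\infty} e^{-v^2}\,dv\right)^2 = \int_{-\infty}^{\infty} e^{-x^2}\,dx \int_{-\infty}^{\infty} e^{-y^2}\,dy = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-(x^2 + y^2)}$$

$$= \int_0^{\infty} 2\pi r e^{-r^2}\,dr = -\pi\int_0^{\infty} d(e^{-r^2}) = \pi$$

$$\Rightarrow \ \int_{-\infty}^{\infty} e^{-v^2}\,dv = \sqrt{\pi} \tag{1.18}$$

where we have done the integral by integrating over the entire x-y plane using
circular rings $r^2 = x^2 + y^2$ as shown in Figure 1.6 below.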

Therefore,

$$1 = A\sigma\sqrt{2\pi} \ \rightarrow \ A = \frac{1}{\sqrt{2\pi}\sigma} \tag{1.19}$$

$$p(u) = \frac{1}{\sqrt{2\pi}\sigma} e^{-(u - u_0)^2/2\sigma^2} \tag{1.20}$$

The probability of measuring u between a and b is

$$\int_a^b p(u)\,du = \text{area under Gaussian curve between } a \text{ and } b \tag{1.21}$$

This can be evaluated numerically.

We now return to considerations of general distributions p(u).
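As an aside on the numerical evaluation of the Gaussian interval probability: the integral over [a, b] has no elementary antiderivative, but it can be written in terms of the error function or approximated by a simple quadrature rule. The Python sketch below shows both under stated assumptions; the function names and the choice of the trapezoid rule with 10000 steps are illustrative only and are not taken from the text.

```python
import math


def gaussian_pdf(u, u0, sigma):
    """Normalized Gaussian density with mean u0 and width sigma."""
    norm = math.sqrt(2.0 * math.pi) * sigma
    return math.exp(-(u - u0) ** 2 / (2.0 * sigma ** 2)) / norm


def prob_interval_erf(a, b, u0, sigma):
    """Probability of finding u in [a, b], using the error function."""
    s = math.sqrt(2.0) * sigma
    return 0.5 * (math.erf((b - u0) / s) - math.erf((a - u0) / s))


def prob_interval_trapezoid(a, b, u0, sigma, n=10000):
    """Same probability from a plain trapezoid rule with n steps."""
    h = (b - a) / n
    total = 0.5 * (gaussian_pdf(a, u0, sigma) + gaussian_pdf(b, u0, sigma))
    for i in range(1, n):
        total += gaussian_pdf(a + i * h, u0, sigma)
    return total * h


if __name__ == "__main__":
    # Probability of landing within one sigma of the mean.
    print(prob_interval_erf(-1.0, 1.0, 0.0, 1.0))
    print(prob_interval_trapezoid(-1.0, 1.0, 0.0, 1.0))
```

Integrating over a range many σ wide on both sides of $u_0$ should return a value very close to 1, which is a numerical check of the normalization constant A.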


Definition 

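The average value (or expectation value) of u = $\langle u \rangle$:

Discrete

$$\langle u \rangle = \frac{1}{N}\sum_{\text{all } i} u_i N_i = \sum_{\text{all } i} u_i\,p(u_i)$$

Continuous

$$\langle u \rangle = \frac{1}{N}\int_{\text{all } u} u\,dN(u) = \int_{\text{all } u} u\,p(u)\,du$$

In general, the average value (expectation value) of some function of u, f(u), is
given by

Discrete

$$\langle f(u) \rangle = \sum_{\text{all } i} f(u_i)\,p(u_i)$$

Continuous

$$\langle f(u) \rangle = \int_{\text{all } u} f(u)\,p(u)\,du$$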

The values measured for u will, of course, not all be equal to $\langle u \rangle$. One would
like a measure of the spread in observed u values around $\langle u \rangle$. Such a measure
is the root-mean-square (rms) or standard deviation given by

$$\Delta u = \sqrt{\langle (u - \langle u \rangle)^2 \rangle}$$

$$\Rightarrow \ (\Delta u)^2 = \langle u^2 \rangle - 2\langle u \rangle\langle u \rangle + \langle u \rangle^2 = \langle u^2 \rangle - \langle u \rangle^2$$

Clearly, if Δu = 0, then the only value measurements of u will yield is $u = \langle u \rangle$
since we have

$$(\Delta u)^2 = \langle (u - \langle u \rangle)^2 \rangle = \sum_i p(u_i)\,(u_i - \langle u \rangle)^2 = 0$$

which, because every term in the sum is non-negative, says that $u_i = \langle u \rangle$ for every i with $p(u_i) \neq 0$.
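For a discrete distribution the definitions of the expectation value and of the standard deviation translate directly into a few lines of code. The sketch below is an illustration under assumptions: the function names are hypothetical, and the fair die used in the demonstration is just a convenient example distribution.

```python
import math


def expectation(values, probs, f=lambda u: u):
    """<f(u)> = sum over i of f(u_i) * p(u_i)."""
    return sum(f(u) * p for u, p in zip(values, probs))


def standard_deviation(values, probs):
    """Delta u = sqrt(<u^2> - <u>^2)."""
    mean = expectation(values, probs)
    mean_sq = expectation(values, probs, lambda u: u * u)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(0.0, mean_sq - mean ** 2))


if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    probs = [1.0 / 6.0] * 6
    print("mean:  ", expectation(faces, probs))
    print("spread:", standard_deviation(faces, probs))
    # A distribution concentrated on one value has zero spread.
    print("spread of a certain outcome:", standard_deviation([2.0], [1.0]))
```

The last line of the demonstration mirrors the remark above: a vanishing Δu corresponds to a distribution in which only one value is ever observed.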


Example A 

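Equally probable anywhere in [0, a].

$$p(u) = \frac{1}{a} \ \text{ for } u \in [0, a] , \quad p(u) = 0 \ \text{ elsewhere}$$

$$\langle u \rangle = \int_0^a u\,\frac{1}{a}\,du = \frac{a}{2}$$

$$\langle u^2 \rangle = \int_0^a u^2\,\frac{1}{a}\,du = \frac{a^2}{3}$$

$$\Delta u = \sqrt{\langle u^2 \rangle - \langle u \rangle^2} = \sqrt{\frac{a^2}{3} - \frac{a^2}{4}} = \frac{a}{\sqrt{12}}$$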


Example B 

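Gaussian distribution.

$$\langle u \rangle = \int_{-\infty}^{\infty} u\,\frac{1}{\sqrt{2\pi}\sigma} e^{-(u - u_0)^2/2\sigma^2}\,du = \int_{-\infty}^{\infty} (v + u_0)\,\frac{1}{\sqrt{2\pi}\sigma} e^{-v^2/2\sigma^2}\,dv$$

$$= \int_{-\infty}^{\infty} v\,\frac{1}{\sqrt{2\pi}\sigma} e^{-v^2/2\sigma^2}\,dv + u_0\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sigma} e^{-v^2/2\sigma^2}\,dv = 0 + u_0 = u_0$$

where $v = u - u_0$. With the further substitution $w = v/(\sqrt{2}\sigma)$,

$$(\Delta u)^2 = \int_{-\infty}^{\infty} (u - u_0)^2\,\frac{1}{\sqrt{2\pi}\sigma} e^{-(u - u_0)^2/2\sigma^2}\,du = \int_{-\infty}^{\infty} v^2\,\frac{1}{\sqrt{2\pi}\sigma} e^{-v^2/2\sigma^2}\,dv = \frac{2\sigma^2}{\sqrt{\pi}}\int_{-\infty}^{\infty} w^2 e^{-w^2}\,dw$$

Another trick:

$$I = \int_{-\infty}^{\infty} e^{-\lambda w^2}\,dw = \frac{1}{\sqrt{\lambda}}\int_{-\infty}^{\infty} e^{-u^2}\,du = \frac{\sqrt{\pi}}{\sqrt{\lambda}}$$

$$\frac{dI}{d\lambda} = -\frac{1}{2}\frac{\sqrt{\pi}}{\lambda^{3/2}} = \int_{-\infty}^{\infty} \frac{d}{d\lambda} e^{-\lambda w^2}\,dw = -\int_{-\infty}^{\infty} w^2 e^{-\lambda w^2}\,dw$$

$$\Rightarrow \ \int_{-\infty}^{\infty} w^2 e^{-\lambda w^2}\,dw = \frac{1}{2}\frac{\sqrt{\pi}}{\lambda^{3/2}}$$

Putting λ = 1 we get

$$\int_{-\infty}^{\infty} w^2 e^{-w^2}\,dw = \frac{\sqrt{\pi}}{2} \ \Rightarrow \ (\Delta u)^2 = \frac{2\sigma^2}{\sqrt{\pi}}\int_{-\infty}^{\infty} w^2 e^{-w^2}\,dw = \sigma^2$$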
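The closed-form results of Examples A and B, Δu = a/√12 for the uniform distribution and Δu = σ for the Gaussian, can be cross-checked by numerical integration. The following Python sketch assumes a simple midpoint rule; the function names, the number of steps, and the cutoff of ten standard deviations used for the Gaussian are illustrative choices, not part of the text.

```python
import math


def moment(pdf, n, lo, hi, steps=20000):
    """<u^n> = integral of u^n p(u) du over [lo, hi], midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * h
        total += (u ** n) * pdf(u)
    return total * h


def spread(pdf, lo, hi):
    """Delta u = sqrt(<u^2> - <u>^2), from numerically computed moments."""
    mean = moment(pdf, 1, lo, hi)
    return math.sqrt(max(0.0, moment(pdf, 2, lo, hi) - mean ** 2))


if __name__ == "__main__":
    a = 2.0
    uniform = lambda u: 1.0 / a                       # Example A on [0, a]
    print(spread(uniform, 0.0, a), a / math.sqrt(12.0))

    u0, sigma = 1.0, 0.5                              # Example B
    gauss = lambda u: (math.exp(-(u - u0) ** 2 / (2.0 * sigma ** 2))
                       / (math.sqrt(2.0 * math.pi) * sigma))
    lo, hi = u0 - 10.0 * sigma, u0 + 10.0 * sigma     # tails beyond this are negligible
    print(moment(gauss, 1, lo, hi), u0)
    print(spread(gauss, lo, hi), sigma)
```

Each printed pair should agree to many decimal places; the zeroth moment, `moment(pdf, 0, lo, hi)`, gives the normalization integral and should be 1.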

These probability distributions can be generalized to several variables. 

In quantum mechanics (where we will focus on one-particle systems) we want
to find expressions for the following:

1. $p(x,y,z;t)\,d^3x = p(x,y,z;t)\,dx\,dy\,dz$ = probability of observing particle
with coordinates in (x, x + dx), (y, y + dy), (z, z + dz) at time t

2. $p(p_x,p_y,p_z;t)\,d^3p = p(p_x,p_y,p_z;t)\,dp_x\,dp_y\,dp_z$ = probability of observing
particle with momentum in $(p_x, p_x + dp_x)$, $(p_y, p_y + dp_y)$, $(p_z, p_z + dp_z)$ at
time t

Finding these expressions is the central problem of quantum (wave) mechanics.

Additional note 

We use continuous distributions for definiteness. The n-th (n = 0, 1, 2, 3, ...)
moment of the probability distribution p(u) is defined to be

$$\langle u^n \rangle = \int du\, u^n p(u) \tag{1.22}$$

Theorem (without proof)

If two distributions $p_1(u)$ and $p_2(u)$ have the same moments for all n, then
$p_1(u) = p_2(u)$, that is, the moments uniquely determine the distribution
(this holds for sufficiently well-behaved distributions, for example those that
vanish outside a finite interval).

Experimental evidence of the so-called Wave-Particle Duality suggests how to
calculate these probabilities. This duality refers to the observation that what
we usually regard as a wave phenomenon (for example, the propagation of light)
sometimes exhibits wave properties and sometimes exhibits particle properties
while what we usually regard as a particle (for example, an electron) sometimes
exhibits particle properties and sometimes exhibits wave properties!

1.2.3. Review of Waves and Diffraction 

Let $\vec{x} = (x, y, z)$, $|\vec{x}| = r = \sqrt{x^2 + y^2 + z^2}$ locate a point in space with respect to
some origin (its position vector) (see Figure 1.7 below). Let $\hat{n}$ = a unit vector
($|\hat{n}| = 1$) pointing in a fixed direction; $\hat{n}$ is dimensionless (it has no units). Let
$\eta(\vec{x}, t)$ be some physical quantity defined at each point $\vec{x}$ at some time t. We
refer to $\eta(\vec{x}, t)$ as a disturbance. In this discussion, we will generalize slightly
and let $\eta(\vec{x}, t)$ take on complex values. For example, $\eta(\vec{x}, t)$ can refer to one
of the Cartesian components of the electric field $\vec{E}(\vec{x}, t)$ or the magnetic field
$\vec{B}(\vec{x}, t)$.

[Figure 1.7: Plane Wave Relationships. A plane surface of constant η.]


Definition 

$\eta(\vec{x}, t)$ is called a plane wave if it can be written in the form

$$\eta(\vec{x}, t) = F(\vec{x}\cdot\hat{n} - vt) \tag{1.23}$$

where F is some arbitrary function of one variable, $\hat{n}$ is some fixed unit
vector and v = a real constant, v > 0. Clearly, v has units

$$\left[\frac{\vec{x}\cdot\hat{n}}{t}\right] = \text{units of velocity} \tag{1.24}$$

Consider the points in space for which $\vec{x}\cdot\hat{n} - vt$ = a constant (η has the same
value at these points). Therefore

$$\vec{x}\cdot\hat{n} = \text{constant} + vt = \text{projection of } \vec{x} \text{ onto the } \hat{n} \text{ direction (see figure)} \tag{1.25}$$

This plane of constant η moves with constant speed v. All points on the plane
perpendicular to $\hat{n}$ have the same value of $\vec{x}\cdot\hat{n}$. These relationships are shown
in Figure 1.7 above.

Thus, the surfaces on which $\eta(\vec{x}, t)$ has a fixed value are plane surfaces perpendicular
to $\hat{n}$ which move at speed v in the $\hat{n}$ direction (hence the name plane
wave).

v = phase velocity of the plane wave, i.e., the velocity
of the planes of constant phase ($\vec{x}\cdot\hat{n} - vt$)
$\hat{n}$ = direction of propagation


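That is the correct and proper definition of a wave!

Now

$$\frac{\partial\eta}{\partial t} = F'\,\frac{\partial(\vec{x}\cdot\hat{n} - vt)}{\partial t} = -vF' , \quad \frac{\partial^2\eta}{\partial t^2} = v^2F''$$

Similarly,

$$\frac{\partial\eta}{\partial x} = F'\,\frac{\partial(\vec{x}\cdot\hat{n} - vt)}{\partial x} = F'\,\frac{\partial(xn_x + yn_y + zn_z - vt)}{\partial x} = n_xF'$$

$$\frac{\partial^2\eta}{\partial x^2} = n_xF''\,\frac{\partial(\vec{x}\cdot\hat{n} - vt)}{\partial x} = n_x^2F''$$

and

$$\frac{\partial^2\eta}{\partial y^2} = n_y^2F'' , \quad \frac{\partial^2\eta}{\partial z^2} = n_z^2F''$$

Therefore,

$$\frac{\partial^2\eta}{\partial x^2} + \frac{\partial^2\eta}{\partial y^2} + \frac{\partial^2\eta}{\partial z^2} = (n_x^2 + n_y^2 + n_z^2)F'' = (\hat{n}\cdot\hat{n})F'' = F'' = \frac{1}{v^2}\frac{\partial^2\eta}{\partial t^2}$$

or

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)\eta = 0$$

which is the classical wave equation. It is obeyed by any plane wave of the form
$\eta(\vec{x}, t) = F(\vec{x}\cdot\hat{n} - vt)$ where v = phase velocity of the plane wave.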
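The claim that any function of the combination $\vec{x}\cdot\hat{n} - vt$ satisfies the classical wave equation can be spot-checked with finite differences. The Python sketch below is a rough numerical illustration under assumptions: the function names, the Gaussian profile chosen for F, the step size h, and the particular unit vector are all hypothetical choices made for the example.

```python
import math


def plane_wave(F, n_hat, v):
    """Return eta(x, y, z, t) = F(x . n_hat - v t) for a unit vector n_hat."""
    nx, ny, nz = n_hat
    return lambda x, y, z, t: F(x * nx + y * ny + z * nz - v * t)


def second_derivative(g, h):
    """Central-difference estimate of g''(0) for a function g of one offset."""
    return (g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)


def wave_equation_residual(eta, v, point, t, h=1e-3):
    """Laplacian(eta) - (1/v^2) d^2 eta/dt^2 at one point, by finite differences."""
    x, y, z = point
    laplacian = (second_derivative(lambda d: eta(x + d, y, z, t), h)
                 + second_derivative(lambda d: eta(x, y + d, z, t), h)
                 + second_derivative(lambda d: eta(x, y, z + d, t), h))
    d2_dt2 = second_derivative(lambda d: eta(x, y, z, t + d), h)
    return laplacian - d2_dt2 / (v * v)


if __name__ == "__main__":
    profile = lambda s: math.exp(-s * s)          # an arbitrary smooth F
    eta = plane_wave(profile, (0.6, 0.0, 0.8), 2.0)
    # Small residual when the equation uses the wave's own phase velocity ...
    print(wave_equation_residual(eta, 2.0, (0.3, -0.2, 0.5), 0.1))
    # ... and a clearly nonzero residual when a different speed is used.
    print(wave_equation_residual(eta, 3.0, (0.3, -0.2, 0.5), 0.1))
```

The residual is not exactly zero because of discretization and rounding error, but it shrinks as h is reduced (down to the point where rounding dominates), whereas the residual computed with the wrong speed stays of order one.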


Note that: 

1. The classical wave equation is linear, that is, if $\eta_1(\vec{x}, t)$ and $\eta_2(\vec{x}, t)$ are
separately solutions of the wave equation, then any linear superposition
(combination)

$$a_1\eta_1(\vec{x}, t) + a_2\eta_2(\vec{x}, t) \tag{1.26}$$

where $a_1$ and $a_2$ are constants, is also a solution.

2. If $\eta(\vec{x}, t)$ is a complex solution of the wave equation, then Real($\eta(\vec{x}, t)$)
and Imag($\eta(\vec{x}, t)$) separately satisfy the wave equation (because $1/v^2$ is
real).

3. The classical wave equation has solutions other than plane wave solutions,
for example, the linear superposition of 2 plane waves traveling in different
directions

$$\eta(\vec{x}, t) = F_1(\vec{x}\cdot\hat{n}_1 - vt) + F_2(\vec{x}\cdot\hat{n}_2 - vt) \tag{1.27}$$

is a solution.


Definition 


$\eta(\vec{x}, t)$ is a spherical wave if it can be written in the form

$$\eta(\vec{x}, t) = \begin{cases} \frac{1}{r}G(r - vt) & \rightarrow \text{spherical outgoing wave} \\ \frac{1}{r}G(r + vt) & \rightarrow \text{spherical incoming wave} \end{cases} \tag{1.28}$$

where $r = |\vec{x}|$, G is some arbitrary function of one variable and where v is a real
constant (v > 0) with the dimensions of velocity.

Now consider the points in space for which $r \mp vt$ = constant (G has the same
value at these points). We consider r − vt and r + vt as 2 separate cases. Therefore,
r = constant ± vt → a sphere (centered at r = 0) that moves (outward or inward)
with speed v. Thus the constant phase surfaces on which $\eta(\vec{x}, t)$ has a fixed
value are spheres (with center at the origin) which move (outward or inward)
with speed v. v = phase velocity of the spherical wave. The 1/r factor was used
in the definition of $\eta(\vec{x}, t)$ so that $\eta(\vec{x}, t)$ obeys the classical wave equation for
r ≠ 0.


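We prove this as follows. First consider a function f(r). We need to calculate
$\nabla^2 f(r)$ where $r = |\vec{x}| = \sqrt{x^2 + y^2 + z^2}$. We have

$$\frac{\partial r}{\partial x} = \frac{x}{r} , \quad \frac{\partial r}{\partial y} = \frac{y}{r} , \quad \frac{\partial r}{\partial z} = \frac{z}{r}$$

$$\frac{\partial f}{\partial x} = \frac{df}{dr}\frac{\partial r}{\partial x} = \frac{df}{dr}\frac{x}{r} , \quad \frac{\partial f}{\partial y} = \frac{df}{dr}\frac{\partial r}{\partial y} = \frac{df}{dr}\frac{y}{r} , \quad \frac{\partial f}{\partial z} = \frac{df}{dr}\frac{\partial r}{\partial z} = \frac{df}{dr}\frac{z}{r}$$

and

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{df}{dr}\frac{x}{r}\right) = \frac{df}{dr}\frac{1}{r} + x\frac{df}{dr}\frac{\partial}{\partial x}\left(\frac{1}{r}\right) + \frac{x}{r}\frac{\partial}{\partial x}\left(\frac{df}{dr}\right)$$

$$= \frac{df}{dr}\frac{1}{r} + x\frac{df}{dr}\frac{d}{dr}\left(\frac{1}{r}\right)\frac{\partial r}{\partial x} + \frac{\partial r}{\partial x}\frac{x}{r}\frac{d}{dr}\left(\frac{df}{dr}\right) = \frac{1}{r}\frac{df}{dr} - \frac{x^2}{r^3}\frac{df}{dr} + \frac{x^2}{r^2}\frac{d^2f}{dr^2}$$

and similarly for $\partial^2 f/\partial y^2$ and $\partial^2 f/\partial z^2$. We then obtain

$$\nabla^2 f = \frac{d^2f}{dr^2}\left(\frac{x^2}{r^2} + \frac{y^2}{r^2} + \frac{z^2}{r^2}\right) + \frac{df}{dr}\left(\frac{3}{r} - \frac{x^2 + y^2 + z^2}{r^3}\right) = \frac{d^2f}{dr^2} + \frac{2}{r}\frac{df}{dr} \tag{1.29}$$

But

$$\frac{1}{r}\frac{d^2}{dr^2}(rf) = \frac{1}{r}\frac{d}{dr}\left(f + r\frac{df}{dr}\right) = \frac{1}{r}\left(r\frac{d^2f}{dr^2} + 2\frac{df}{dr}\right) \tag{1.30}$$

Therefore,

$$\nabla^2 f(r) = \frac{1}{r}\frac{d^2}{dr^2}\left(rf(r)\right) \tag{1.31}$$

for r ≠ 0 since things are not defined at r = 0. Now for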


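$$\eta(\vec{x}, t) = \frac{1}{r}G(r \mp vt) \tag{1.32}$$

we have

$$\frac{\partial\eta}{\partial t} = \mp\frac{v}{r}G' , \quad \frac{\partial^2\eta}{\partial t^2} = (\mp v)^2\frac{1}{r}G'' = v^2\frac{1}{r}G''$$

$$\nabla^2\eta = \frac{1}{r}\frac{\partial^2}{\partial r^2}(r\eta) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\left(G(r \mp vt)\right) = \frac{1}{r}G''$$

$$\nabla^2\eta - \frac{1}{v^2}\frac{\partial^2\eta}{\partial t^2} = 0 \tag{1.33}$$

so that spherical waves also obey the classical wave equation (for r ≠ 0). The
1/r factor is required physically of course to account for the $1/r^2$ decrease in the
intensity of the spherical wave with distance.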
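The key step in this argument is the identity that, for a function of r alone, the Laplacian equals (1/r) d²(rf)/dr². It can be checked numerically by comparing a Cartesian finite-difference Laplacian with the radial formula. The Python sketch below is illustrative only: the function names, the step size, the test point and the test function cos(2r)/r (a snapshot of a spherical wave with k = 2) are assumptions made for the example.

```python
import math


def laplacian_cartesian(f_of_r, point, h=1e-3):
    """Finite-difference Laplacian of f(|x|) using x, y, z second differences."""
    def f(x, y, z):
        return f_of_r(math.sqrt(x * x + y * y + z * z))

    x, y, z = point
    centre = f(x, y, z)
    total = 0.0
    for dx, dy, dz in ((h, 0.0, 0.0), (0.0, h, 0.0), (0.0, 0.0, h)):
        total += f(x + dx, y + dy, z + dz) - 2.0 * centre + f(x - dx, y - dy, z - dz)
    return total / (h * h)


def laplacian_radial(f_of_r, r, h=1e-3):
    """(1/r) d^2(r f)/dr^2 evaluated with a central difference in r."""
    g = lambda s: s * f_of_r(s)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h) / r


if __name__ == "__main__":
    f = lambda r: math.cos(2.0 * r) / r       # snapshot of a spherical wave, k = 2
    point = (1.0, 2.0, 2.0)                   # a point with r = 3, away from the origin
    print(laplacian_cartesian(f, point))
    print(laplacian_radial(f, 3.0))
    print(-4.0 * math.cos(6.0) / 3.0)         # exact value, -k^2 cos(kr)/r
```

The three printed numbers should agree to several decimal places. The check must be made away from r = 0, where the function is not defined.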


Definition 

A function $\eta(\vec{x}, t)$ has a definite frequency if it is of the form

$$\eta(\vec{x}, t) = f_+(\vec{x})e^{i\omega t} + f_-(\vec{x})e^{-i\omega t} \tag{1.34}$$

where $f_\pm(\vec{x})$ are arbitrary functions of $\vec{x}$ and where ω is a real number such that
ω > 0. ω = an angular frequency. Now

$$e^{\pm i\omega t} = \cos\omega t \pm i\sin\omega t \tag{1.35}$$

so that this $\eta(\vec{x}, t)$ is a linear superposition of cos ωt and sin ωt.

In addition, since $e^{\pm 2\pi i} = +1$, at any point $\vec{x}$, this function $\eta(\vec{x}, t)$ repeats itself
in time after a time interval T (period of $\eta(\vec{x}, t)$), where ωT = 2π or

$$T = \frac{2\pi}{\omega} \tag{1.36}$$

independent of $\vec{x}$. The frequency of $\eta(\vec{x}, t)$ is then

$$f = \frac{1}{T} = \frac{\omega}{2\pi} \tag{1.37}$$


Plane Wave of Definite Frequency 

We have

$$\eta(\vec{x}, t) = f_+(\vec{x})e^{i\omega t} + f_-(\vec{x})e^{-i\omega t} = F(\vec{x}\cdot\hat{n} - vt) \tag{1.38}$$

Therefore, we can write

$$f_\pm(\vec{x})e^{\pm i\omega t} = f_\pm(\vec{x})e^{\pm i\frac{\omega}{v}vt} = A_\pm e^{\pm i\frac{\omega}{v}(vt - \vec{x}\cdot\hat{n})} \tag{1.39}$$

where $A_\pm$ are constants so that it fits the standard functional form. If we let
ω/v = k and $\vec{k} = k\hat{n}$ = propagation vector so that $\vec{k}$ is parallel to $\hat{n}$ = direction of
wave propagation, then

$$\eta(\vec{x}, t) = A_+e^{+i(\omega t - \vec{k}\cdot\vec{x})} + A_-e^{-i(\omega t - \vec{k}\cdot\vec{x})} \tag{1.40}$$

Now let n = z (propagation in the 2 -direction) for definiteness. Therefore, 

r](x, t ) = A + e^ ut - kz) + A^e~ i{uit - kz) (1.41) 

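At a given time, $\eta(\vec{x},t)$ repeats itself in $z$ after a distance $\lambda$ (wavelength of
$\eta(\vec{x},t)$) where $k\lambda = 2\pi$ or

$$k = \frac{2\pi}{\lambda}\ , \quad \lambda \text{ independent of } t \tag{1.42}$$

Now

$$\lambda f = \frac{2\pi}{k}\,\frac{\omega}{2\pi} = \frac{\omega}{k} = v \tag{1.43}$$

or

$$\text{wavelength} \times \text{frequency} = \text{phase velocity} \tag{1.44}$$

Note: $e^{\pm i(\omega t - kz)}$ has phase $(\omega t - kz)$. This phase is constant when

$$\omega t - kz = \text{constant} \tag{1.45}$$

or when

$$z = -\frac{\text{constant}}{k} + \frac{\omega}{k}\,t = -\frac{\text{constant}}{k} + vt \tag{1.46}$$

Therefore, the planes of constant phase move with velocity $\omega/k = v$ in the $+z$
direction; hence the name phase velocity for $v$ that we have been using.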
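As a quick numerical illustration of (1.42)-(1.46), the sketch below computes $k$, $\lambda$, $T$ and $f$ from an assumed $\omega$ and $v$, and then follows a plane of constant phase to confirm that it moves at the phase velocity. The numerical values and function names are arbitrary choices for illustration and are not from the text.

```python
import math


def wave_parameters(omega, v):
    # k = omega / v, lambda = 2 pi / k, T = 2 pi / omega, f = 1 / T
    k = omega / v
    wavelength = 2 * math.pi / k
    period = 2 * math.pi / omega
    frequency = 1 / period
    return k, wavelength, period, frequency


def constant_phase_position(t, omega, k, phase):
    # Solve omega * t - k * z = phase for z, as in (1.46).
    return (omega * t - phase) / k


omega, v = 6.0, 3.0
k, wavelength, period, frequency = wave_parameters(omega, v)

# Follow one plane of constant phase over a time interval of 2 units.
z_start = constant_phase_position(0.0, omega, k, phase=1.0)
z_end = constant_phase_position(2.0, omega, k, phase=1.0)
plane_speed = (z_end - z_start) / 2.0
```

Here `wavelength * frequency` and `plane_speed` should both reproduce the assumed value of `v`.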


Spherical Waves of Definite Frequency 

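Spherical waves of definite frequency are given by

$$\begin{aligned}
\eta(\vec{x},t) &= f_+(\vec{x})\,e^{i\omega t} + f_-(\vec{x})\,e^{-i\omega t} = \frac{1}{r}\,G(r \mp vt) \\
&= \frac{1}{r}\left[A_+\,e^{+i\frac{\omega}{v}(vt \mp r)} + A_-\,e^{-i\frac{\omega}{v}(vt \mp r)}\right] \\
&= A_+\,\frac{e^{+i(\omega t \mp kr)}}{r} + A_-\,\frac{e^{-i(\omega t \mp kr)}}{r}
\end{aligned} \tag{1.47}$$

where

$$\frac{\omega}{v} = k = \frac{2\pi}{\lambda} \tag{1.48}$$

as in the plane wave case.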

There exists a relationship between plane waves of definite frequency and
spherical waves of definite frequency.

Consider the functions defined by the integrals

$$J_\pm = \int_{\text{infinite plane}} dA\,\frac{e^{\pm ikR}}{R}\,e^{-\varepsilon R} \quad \text{(a definition!)} \tag{1.49}$$

where $e^{-\varepsilon R}$ is a convergence factor and we will take the limit $\varepsilon \to 0^+$ after the
integration has been done. If we were to take $\varepsilon \to 0^+$ inside the integral, then
$e^{-\varepsilon R} \to 1$ and the resulting integral is undefined, or so it seems (as will be seen
below).

Since all area elements on the circular ring shown in Figure 1.8 below have the
same $R$



Figure 1.8: Integration Details


we can write (using Figure 1.8)

$$J_\pm = \int_{r=0}^{r=\infty} 2\pi r\,dr\,\frac{e^{\pm ikR}}{R}\,e^{-\varepsilon R} \tag{1.50}$$

which is just integrating over the rings.

Now $R = \sqrt{r^2 + z^2}$ with $z$ fixed during the integration. Therefore,

$$dR = \frac{dR}{dr}\,dr = \frac{1}{R}\,r\,dr \tag{1.51}$$

with $R$ going from $R = z$ (when $r = 0$) to $R = \infty$ (when $r = \infty$). Therefore

$$J_\pm = \int_{R=z}^{R=\infty} 2\pi\,dR\,e^{(\pm ik - \varepsilon)R} = 2\pi\left[\frac{e^{(\pm ik - \varepsilon)R}}{\pm ik - \varepsilon}\right]_{R=z}^{R=\infty} \tag{1.52}$$

The value at the upper limit vanishes because $e^{\pm ik(\infty)}e^{-\varepsilon(\infty)} = 0$ for $\varepsilon > 0$. If
we had let $\varepsilon = 0$ at the beginning of the calculation, then the value at the upper
limit would have been $e^{\pm ik(\infty)}$, which is undefined (it just keeps on oscillating).
Thus,

$$J_\pm = \lim_{\varepsilon \to 0^+} \frac{-2\pi}{\pm ik - \varepsilon}\,e^{(\pm ik - \varepsilon)z} = \pm\frac{2\pi i}{k}\,e^{\pm ikz} \tag{1.53}$$

so that we get the amazing result (limit $\varepsilon \to 0^+$ understood)

$$e^{\pm ikz} = \pm\frac{k}{2\pi i}\int_{\text{infinite plane}} dA\,\frac{e^{\pm ikR}}{R}\,e^{-\varepsilon R} \tag{1.54}$$

where $R$ = distance from $dA$ to a fixed point located a distance $z$ from the plane.
This then implies that

$$C_+\,e^{+i(\omega t - kz)} + C_-\,e^{-i(\omega t - kz)} = \frac{k}{2\pi i}\iint_{\substack{\text{entire plane} \\ \perp \text{ to } z\text{-axis at } z=0}} dA\,e^{-\varepsilon R}\left[-C_+\,\frac{e^{+i(\omega t - kR)}}{R} + C_-\,\frac{e^{-i(\omega t - kR)}}{R}\right] \tag{1.55}$$

Thus, a plane wave propagating in the $+z$ direction is equal to a linear
superposition of spherical outgoing waves emanating from each point on a plane
perpendicular to the $z$-axis, that is,

$$e^{+i(\omega t - kz)} = -\frac{k}{2\pi i}\iint_{\substack{\text{entire plane} \\ \perp \text{ to } z\text{-axis at } z=0}} dA\,e^{-\varepsilon R}\,\frac{e^{+i(\omega t - kR)}}{R} \tag{1.56}$$
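The role of the convergence factor can be made concrete with a small numerical experiment. The sketch below, under assumed values $k = 2$, $z = 1$, $\varepsilon = 0.1$, integrates $2\pi e^{(ik - \varepsilon)R}$ from $R = z$ outward with the trapezoidal rule and compares the result with the closed form in (1.52); it also checks that the closed form approaches the right-hand side of (1.53) as $\varepsilon \to 0^+$. The function names, step count, and cutoff distance are illustrative assumptions.

```python
import cmath
import math


def j_closed_form(k, z, eps, sign=1):
    # Closed form of (1.52) for eps > 0, before the limit eps -> 0+ is taken.
    a = sign * 1j * k - eps
    return -2 * math.pi * cmath.exp(a * z) / a


def j_numeric(k, z, eps, sign=1, span=200.0, steps=100_000):
    # Trapezoidal estimate of 2*pi * integral from z to z+span of exp(a R) dR.
    # The tail beyond z+span is suppressed by exp(-eps * span).
    a = sign * 1j * k - eps
    h = span / steps
    total = 0.5 * (cmath.exp(a * z) + cmath.exp(a * (z + span)))
    for n in range(1, steps):
        total += cmath.exp(a * (z + n * h))
    return 2 * math.pi * h * total


k, z, eps = 2.0, 1.0, 0.1
numeric = j_numeric(k, z, eps)
closed = j_closed_form(k, z, eps)

# Right-hand side of (1.53) for the upper sign.
limit_value = 2j * math.pi / k * cmath.exp(1j * k * z)
near_limit = j_closed_form(k, z, 1e-9)
```

With $\varepsilon > 0$ the numerical integral settles down and matches the closed form; without the factor the integrand would keep oscillating and the truncated sum would depend on where the integration is cut off.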


Note A 

A classical physical quantity (for example, $\vec{E}$ or $\vec{B}$) is a real quantity. Therefore,
$\eta_{physical}(\vec{x},t)$ = real linear superposition of $\cos(\omega t - \vec{k}\cdot\vec{x})$ and $\sin(\omega t - \vec{k}\cdot\vec{x})$ is a
plane wave of definite frequency, and $\eta_{physical}(\vec{x},t)$ = real linear superposition of
$\cos(\omega t - kr)/r$ and $\sin(\omega t - kr)/r$ is a spherical wave of definite frequency.


For definiteness, let us consider plane waves (similar results hold for spherical
waves).


A real linear superposition of $\cos(\omega t - \vec{k}\cdot\vec{x})$ and $\sin(\omega t - \vec{k}\cdot\vec{x})$ can be written as

$$\eta_{physical}(\vec{x},t) = A\cos(\vec{k}\cdot\vec{x} - \omega t + \phi)\ , \quad A, \phi \text{ real} \tag{1.57}$$

We let

$$\eta(\vec{x},t) = A\,e^{i\phi}\,e^{i(\vec{k}\cdot\vec{x} - \omega t)} = Z\,e^{i(\vec{k}\cdot\vec{x} - \omega t)}\ , \quad Z = \text{complex number} \tag{1.58}$$

so that

$$\eta_{physical}(\vec{x},t) = A\cos(\vec{k}\cdot\vec{x} - \omega t + \phi) = \mathrm{Real}(\eta(\vec{x},t)) \tag{1.59}$$

Therefore, it is sufficient that we consider $\eta(\vec{x},t) = Z\,e^{i(\vec{k}\cdot\vec{x} - \omega t)}$ in our discussions.
When $\eta_{physical}(\vec{x},t)$ is required, we need only take the real part of $\eta(\vec{x},t)$. We do
this because $\eta(\vec{x},t)$ is simpler than $\eta_{physical}(\vec{x},t)$ to manipulate in calculations.
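The statement that the physical wave is the real part of the complex wave is easy to confirm numerically. The sketch below works in one dimension (taking $\vec{k}\cdot\vec{x} = kx$), and the amplitude, phase, and sample point are arbitrary values assumed only for the check.

```python
import cmath
import math


def eta_complex(x, t, amplitude, phi, k, omega):
    # Complex representation Z * exp(i (k x - omega t)) with Z = A exp(i phi).
    z = amplitude * cmath.exp(1j * phi)
    return z * cmath.exp(1j * (k * x - omega * t))


def eta_physical(x, t, amplitude, phi, k, omega):
    # Real wave A cos(k x - omega t + phi).
    return amplitude * math.cos(k * x - omega * t + phi)


params = dict(amplitude=1.7, phi=0.4, k=2.5, omega=3.0)
complex_value = eta_complex(0.8, 1.3, **params)
real_value = eta_physical(0.8, 1.3, **params)
```

The real part of `complex_value` should agree with `real_value`, and its modulus should equal the amplitude regardless of position and time.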


The analogous result for a spherical wave is obvious:

$$\eta(\vec{x},t) = Z\,\frac{e^{i(kr - \omega t)}}{r} \quad \text{outgoing wave} \tag{1.60}$$

$$\eta(\vec{x},t) = Z\,\frac{e^{-i(kr + \omega t)}}{r} \quad \text{incoming wave} \tag{1.61}$$

with

$$\eta_{physical} = \mathrm{Real}(\eta) \tag{1.62}$$

The linear superposition of spherical outgoing waves which gives a plane wave
can therefore be used in the earlier form (1.55) with $C_+ = 1$ and $C_- = 0$:

$$e^{+i(\omega t - kz)} = -\frac{k}{2\pi i}\iint_{\substack{\text{entire plane} \\ \perp \text{ to } z\text{-axis at } z=0}} dA\,e^{-\varepsilon R}\,\frac{e^{+i(\omega t - kR)}}{R} \tag{1.63}$$

As I said, we shall do our calculations with $\eta(\vec{x},t)$, and $\eta_{physical}(\vec{x},t)$ is recovered
at any stage of the calculation simply by taking the real part.


Note B 

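If $\eta(\vec{x},t) = F(\vec{x})\,e^{-i\omega t}$ has definite frequency, where $F(\vec{x}) = A(\vec{x})\,e^{i\psi(\vec{x})}$ and $A$, $\psi$
are real, then we have

$$\eta_{physical}(\vec{x},t) = \mathrm{Real}(\eta(\vec{x},t)) = A(\vec{x})\cos(\omega t - \psi(\vec{x})) \tag{1.64}$$

The intensity of a wave = magnitude of energy flow per unit time per unit area.
This means that

$$\text{intensity} \propto \bigl(\eta_{physical}(\vec{x},t)\bigr)^2 = \text{intensity at } \vec{x} \text{ and } t \tag{1.65}$$

For example, the Poynting vector $\vec{S} = c(\vec{E} \times \vec{B})/4\pi$ gives the energy flow per
unit time per unit area or the intensity. For a plane wave of definite frequency
propagating in free space (vacuum), $|\vec{E}| = |\vec{B}|$ and $\vec{E} \perp \vec{B}$ with $\vec{E} \times \vec{B}$ in the direction
of propagation.

Therefore, averaging over one period $T$,

$$\begin{aligned}
\text{intensity} &\propto \frac{1}{T}\int_0^T dt\,\bigl(\eta_{physical}(\vec{x},t)\bigr)^2 = \frac{1}{T}\int_0^T dt\,\bigl(A(\vec{x})\cos(\omega t - \psi(\vec{x}))\bigr)^2 \\
&= \frac{(A(\vec{x}))^2}{\omega T}\int_0^{\omega T} d(\omega t)\,\cos^2(\omega t - \psi(\vec{x})) \\
&= \frac{(A(\vec{x}))^2}{2\pi}\int_0^{2\pi} du\,\cos^2(u - \psi(\vec{x})) = \frac{(A(\vec{x}))^2}{2}
\end{aligned} \tag{1.66}$$

Thus, the time average is

$$\overline{\bigl(\eta_{physical}(\vec{x},t)\bigr)^2} = \frac{1}{2}\,|\eta(\vec{x},t)|^2 \tag{1.67}$$

which is independent of $t$ since $\eta \propto e^{-i\omega t}$. Clearly we have the result

$$\text{intensity} \propto |\eta(\vec{x},t)|^2 \tag{1.68}$$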
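The factor of $1/2$ in the time average can be verified with a direct numerical average over one period. The following sketch uses a midpoint-rule sum; the amplitude, phase, and angular frequency are arbitrary assumed values.

```python
import math


def time_averaged_square(amplitude, psi, omega, samples=10_000):
    # Average of (A cos(omega t - psi))^2 over one period T = 2 pi / omega,
    # using the midpoint rule.
    period = 2 * math.pi / omega
    dt = period / samples
    total = 0.0
    for n in range(samples):
        t = (n + 0.5) * dt
        total += (amplitude * math.cos(omega * t - psi)) ** 2
    return total * dt / period


amplitude = 3.0
average = time_averaged_square(amplitude, psi=0.7, omega=5.0)
expected = amplitude ** 2 / 2  # equals |eta|^2 / 2, since |eta| = A
```

The result does not depend on the phase $\psi$, which is why the averaged intensity depends only on $|\eta|^2$.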


1.2.4. Diffraction of Waves 

Referring to Figure 1.9 below, we choose the origin of the coordinate system to 
lie in the region containing the aperture. 



Figure 1.9: Opaque Screen with Holes 


We assume a wave incident from the $z < 0$ region given by

$$\eta_{incident}(\vec{x},t) = e^{i(kz - \omega t)} \tag{1.69}$$

In general, the screen will affect the incident wave so that there will be scattered
waves (back into the $z < 0$ region) and diffracted waves (waves in the $z > 0$
region).

Now, using (1.54),

$$\eta_{incident}(\vec{x},t) = e^{i(kz - \omega t)} = \frac{k}{2\pi i}\iint_{\substack{\text{entire} \\ z=0 \text{ plane}}} dA\,e^{-\varepsilon R}\,\frac{e^{i(kR - \omega t)}}{R} \tag{1.70}$$

This would be the wave for $z > 0$ if no screen were present. This plane wave is a
linear superposition of spherical outgoing waves emanating from each point in
the $z = 0$ plane. The Kirchhoff approximation for the diffracted wave is simply
a linear superposition of spherical outgoing waves emanating from each point in
the openings in the screen, with each of these spherical waves having a coefficient
equal to the coefficient in the expansion of $\eta_{incident}(\vec{x},t) = e^{i(kz - \omega t)}$. Note that
the diffracted wave contains no spherical waves emanating from points on the
opaque screen itself.

$$\eta_{diffracted}(\vec{x},t) = \frac{k}{2\pi i}\iint_{\substack{\text{openings in the} \\ \text{opaque screen}}} dA\,e^{-\varepsilon R}\,\frac{e^{i(kR - \omega t)}}{R} \tag{1.71}$$

where $R$ is the distance from $dA$ to the point $\vec{x}$.


This result seems reasonable, but note that we have proved nothing! To prove
that this gives a good approximation for $\eta_{diffracted}(\vec{x},t)$ requires an analysis of
the solutions to the classical wave equation and the boundary values of these
solutions at the screen and in the openings. The Kirchhoff approximation is a
good one under the following conditions:

1. $r \gg \lambda \rightarrow kr \gg 1$ and $r \gg$ linear dimensions of the region containing
the apertures. Thus, we must be far from the apertures for the above
expression for $\eta_{diffracted}(\vec{x},t)$ to be valid.

2. $\theta \ll 1$, that is, $\eta_{diffracted}(\vec{x},t)$ should be evaluated with the above
expressions only when $\vec{x}$ makes a small angle with the $z$-axis.

3. $\lambda \ll$ linear dimensions of the region containing the apertures (high
frequency limit). We will apply the above expression to situations in which
this condition is sometimes violated - in such cases our results will only
be qualitatively accurate.

4. If the wave is described by a vector field ($\vec{E}$ and $\vec{B}$), then there exist
relationships among the various components. These relationships have
not been taken into account in the Kirchhoff approximation and therefore
this approximation has neglected all polarization effects.

Note: When the apertures on the screen are finite, the integral is over a finite
area and the limit $\varepsilon \to 0^+$ may be taken inside the integral (with $e^{-\varepsilon R} \to 1$)
because the resulting integral will be well-defined and exist.

One must do the integral before letting $\varepsilon \to 0^+$ only when the integration extends
over an infinite area (which is unphysical anyway!).
Application


We now discuss diffraction by two small holes in the opaque screen. The
experimental configuration is shown in Figure 1.10 below.



Figure 1.10: Two Small Holes (labels: opaque screen with 2 very small holes of area $\Delta A$; screen for viewing)


We shall calculate $\eta_{diffracted}$ only on the line through the geometrical projection
of the two given holes (shown by the arrow in the figure) in order to simplify
the calculation.

Looking from the side we get the diagram in Figure 1.11:



Figure 1.11: Side View
We have

$$\eta_{diffracted}(\vec{x},t) = +\frac{k}{2\pi i}\int_{\text{openings}} dA\,e^{-\varepsilon R}\,\frac{e^{i(kR - \omega t)}}{R} = +\frac{k}{2\pi i}\int_{\text{openings}} dA\,\frac{e^{i(kR - \omega t)}}{R} \tag{1.72}$$

where we have taken the limit inside the integral. For small openings this gives

$$\begin{aligned}
\eta_{diffracted}(\vec{x},t) &\approx \frac{k}{2\pi i}\left[\Delta A\,\frac{e^{i(kr - \omega t)}}{r} + \Delta A\,\frac{e^{i(k(r + a\sin\theta) - \omega t)}}{r + a\sin\theta}\right] \\
&= \frac{k}{2\pi i}\,\Delta A\,\frac{e^{i(kr - \omega t)}}{r}\left[1 + \frac{e^{ika\sin\theta}}{1 + \frac{a}{r}\sin\theta}\right]
\end{aligned} \tag{1.73}$$

Since $r \gg a\sin\theta$ we have

$$|\eta_{diff}| = \frac{k}{2\pi}\,\frac{\Delta A}{r}\,\bigl|1 + e^{ika\sin\theta}\bigr| \tag{1.74}$$

and

$$\text{intensity} \propto |\eta_{diff}|^2 = \frac{k^2(\Delta A)^2}{4\pi^2 r^2}\,\bigl(2 + 2\cos(ka\sin\theta)\bigr)$$

$$\text{intensity} \propto 1 + \cos(ka\sin\theta) \tag{1.75}$$

A typical interference pattern is shown below in Figure 1.12.



Figure 1.12: Interference Pattern

Note (as shown) that the first zero of the intensity occurs at $\sin\theta = \pi/ka$.
Therefore,

$$ka\sin\theta = \frac{2\pi}{\lambda}\,a\sin\theta = \pi \rightarrow a\sin\theta = \frac{\lambda}{2} \tag{1.76}$$

for the first zero.
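The two-hole result can be tabulated directly. This sketch evaluates the relative intensity $1 + \cos(ka\sin\theta)$ and confirms that it vanishes at $a\sin\theta = \lambda/2$; the wavelength and hole separation are assumed values picked only so that $\lambda \ll a$.

```python
import math


def relative_intensity(sin_theta, wavelength, a):
    # Relative two-hole intensity from (1.75): 1 + cos(k a sin(theta)).
    k = 2 * math.pi / wavelength
    return 1 + math.cos(k * a * sin_theta)


def first_zero_sin_theta(wavelength, a):
    # First zero from (1.76): a sin(theta) = lambda / 2.
    return wavelength / (2 * a)


wavelength = 5.0e-5   # cm, assumed visible-light wavelength
a = 1.0e-2            # cm, assumed hole separation

central_maximum = relative_intensity(0.0, wavelength, a)
sin_theta_zero = first_zero_sin_theta(wavelength, a)
intensity_at_zero = relative_intensity(sin_theta_zero, wavelength, a)
```

For these values the first zero sits at $\sin\theta = 2.5 \times 10^{-3}$, comfortably inside the small-angle regime required by the Kirchhoff approximation.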


1.3. Review of Particle Dynamics 

The following discussion is relativistically valid, that is, it holds for particles 
traveling at any speed v < c. 


1.3.1. Relativistic Dynamics 


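$$\vec{F} = \frac{d\vec{p}}{dt} \quad \text{with} \quad \vec{p} = m\gamma\vec{v} \quad \text{and} \quad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \tag{1.77}$$

where $m$ = particle's rest mass (written $m_0$ in what follows; it is constant and does not change with $v$).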


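Non-relativistically ($v \ll c$) implies that $\gamma \approx 1$ and $\vec{p} \approx m\vec{v}$. The kinetic energy
(KE) of a particle is defined such that the work done by $\vec{F}$ on the particle equals
the change in KE. Thus, the definition of kinetic energy is

$$\Delta K = K - K_0 = \int_{\vec{r}_0}^{\vec{r}} \vec{F}\cdot d\vec{r} = \int_{\vec{r}_0}^{\vec{r}} \frac{d\vec{p}}{dt}\cdot d\vec{r} \tag{1.78}$$

Now, we have that $\vec{p} = m_0\gamma(v)\vec{v}$ where $\gamma(v) = (1 - \beta^2)^{-1/2}$, $\beta = v/c$. Therefore we
have

$$K - K_0 = \int_{\vec{r}_0}^{\vec{r}} \frac{d}{dt}\bigl(m_0\gamma(v)\vec{v}\bigr)\cdot\vec{v}\,dt = m_0\int_0^{v} \vec{v}\cdot d\bigl(\gamma(v)\vec{v}\bigr) \tag{1.79}$$

Since the kinetic energy is zero when the velocity is zero we finally have

$$K = m_0\int_0^{v} \vec{v}\cdot d\bigl(\gamma(v)\vec{v}\bigr) \tag{1.80}$$

Now since

$$d(\gamma v^2) = d(\gamma\vec{v}\cdot\vec{v}) = \vec{v}\cdot d(\gamma\vec{v}) + \gamma\vec{v}\cdot d\vec{v} \tag{1.81}$$

we can write

$$\begin{aligned}
K &= m_0\int_0^{v}\bigl(d(\gamma v^2) - \gamma\vec{v}\cdot d\vec{v}\bigr) = m_0\int_0^{v} d(\gamma v^2) - \frac{1}{2}m_0\int_0^{v}\gamma\,d(v^2) \\
&= m_0\gamma v^2 - \frac{1}{2}m_0c^2\int_0^{v^2/c^2}\frac{du}{\sqrt{1 - u}} \\
&= m_0\gamma v^2 + m_0c^2\left(\frac{1}{\gamma} - 1\right) = m_0c^2\left(\gamma\beta^2 + \frac{1}{\gamma} - 1\right) \\
&= m_0c^2(\gamma - 1)
\end{aligned} \tag{1.82}$$

The first thing we should do is check that this makes sense. What is the low
velocity limit of this expression?


Using

$$\gamma = (1 - \beta^2)^{-1/2} \rightarrow 1 + \frac{1}{2}\beta^2 = 1 + \frac{v^2}{2c^2} \tag{1.83}$$

we have

$$K = m_0c^2(\gamma - 1) \rightarrow m_0c^2\,\frac{v^2}{2c^2} = \frac{1}{2}m_0v^2 \tag{1.84}$$

as expected.

If we rearrange this result we have

$$\begin{aligned}
\gamma m_0c^2 &= K + m_0c^2 \\
&= \text{Energy(motion)} + \text{Energy(rest)} \\
&= \text{Total Energy} = E
\end{aligned} \tag{1.85}$$

The total energy is conserved.


What is the connection to momentum? Some algebra gives the following results:

$$\frac{pc}{E} = \frac{m_0\gamma vc}{\gamma m_0c^2} = \frac{v}{c} = \beta \tag{1.86}$$

and

$$\left(\frac{E}{c}\right)^2 - p^2 = m_0^2c^2 = \text{invariant} \tag{1.87}$$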
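These relations are easy to check numerically. The sketch below works in CGS units, uses an assumed value for the speed of light, and takes a rest mass of 1 gm purely for illustration; it verifies the low-velocity limit (1.84), the ratio (1.86), and the invariant (1.87).

```python
import math

C = 2.998e10  # speed of light in cm/sec (value assumed, not quoted in the text)


def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def kinetic_energy(m0, v):
    # K = m0 c^2 (gamma - 1), equation (1.82)
    return m0 * C ** 2 * (gamma(v) - 1.0)


def total_energy(m0, v):
    # E = gamma m0 c^2, equation (1.85)
    return gamma(v) * m0 * C ** 2


def momentum(m0, v):
    # p = m0 gamma v
    return m0 * gamma(v) * v


m0 = 1.0                 # gm, illustrative rest mass
v_slow = 1.0e-3 * C      # slow enough for the non-relativistic limit
v_fast = 0.6 * C         # gamma = 1.25

newtonian = 0.5 * m0 * v_slow ** 2
relative_difference = abs(kinetic_energy(m0, v_slow) - newtonian) / newtonian

E = total_energy(m0, v_fast)
p = momentum(m0, v_fast)
invariant = (E / C) ** 2 - p ** 2   # should equal (m0 c)^2
beta_from_ratio = p * C / E         # should equal v / c
```

At $v = 10^{-3}c$ the relativistic and Newtonian kinetic energies differ only at the level of $\beta^2$, while at $v = 0.6c$ the invariant still reproduces $m_0^2c^2$.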

We now turn our attention to the so-called wave-particle duality exhibited by
physical systems - a given physical system sometimes exhibits wave-like
properties and sometimes exhibits particle-like properties. This experimentally
observed duality will suggest how to calculate the probabilities $p(\vec{x},t)\,d^3x$ and
$p(\vec{p},t)\,d^3p$.


1.4. Wave-Particle Duality of Light

These results actually apply to all electromagnetic radiation.

1. The wave nature of light is manifest in the diffraction (interference)
patterns observed when light passes through apertures in an opaque screen.

2. The particle nature of light is exhibited in the photoelectric effect (as well
as many other experiments).

In the experiment shown in Figure 1.13 below, the incident light causes
photoelectrons to be emitted at the cathode (alkali metal plate held at negative
potential). The voltage $V$ gives rise to a current $I$ as the photoelectrons are
collected at the surrounding anode.



Figure 1.13: Experimental Setup 

For a given metal and fixed frequency $\nu$, the dependence of $I$ on $V$ is observed
to be as shown in Figure 1.14 below:



Figure 1.14: Experimental Data 

The photoelectrons leaving the metal have different kinetic energies. When $V$
is sufficiently high, all electrons leaving the metal will be collected at the anode.
A further increase in $V$ will not, therefore, increase $I$ because the number of
electrons leaving the metal per unit time is determined by the incident intensity.

As $V$ is lowered (but still positive), the slower electrons emitted from the metal
will not reach the anode (ordinarily all the electrons would be attracted to the
anode for $V > 0$; however, the electrons already traveling to the anode tend to
repel newly emitted photoelectrons).

As $V$ is made negative, still fewer photoelectrons reach the surrounding
conductor. However, a non-zero current can still flow because those electrons emitted
with very high kinetic energy can overcome the negative potential difference.

At $V = -V_0$, those electrons emitted with the maximum kinetic energy will
barely be able to overcome the negative potential difference, and $V < -V_0$
will therefore give zero current.


Thus, $V_0$ is a measure of the maximum kinetic energy of the emitted
photoelectrons (charge $q = -e$, $e > 0$) so that

$$\begin{aligned}
(KE)_{\text{at metal}} + qV_{\text{at metal}} &= (KE)_{\text{at surrounding conductor}} + qV_{\text{at surrounding conductor}} \\
(KE)_{\text{at metal}} &= (KE)_{\text{at surrounding conductor}} + q\bigl(V_{\text{at surrounding conductor}} - V_{\text{at metal}}\bigr) \\
(KE)_{\text{at metal}} &= (KE)_{\text{at surrounding conductor}} + qV
\end{aligned}$$

When photoelectrons barely get to the anode, $(KE)_{\text{at surrounding conductor}} = 0$ so that

$$(KE)_{\text{at metal}} = qV = -eV \tag{1.88}$$

and we have

$$(KE)_{max} = eV_0 \tag{1.89}$$

$V_0$ versus frequency $\nu$ (for a given metal) is shown in Figure 1.15 below.



Figure 1.15: $V_0$ versus $\nu$ (figure annotation: depends on metal used)

For $\nu < \nu_0$, no current flows (regardless of how intense the incident light beam
is).


The classical wave theory of light cannot explain these observations. Indeed, the
classical theory requires that the electric field $|\vec{E}|$ increases as the intensity of
the light increases (intensity $\propto |\vec{E}|^2$). Since the light's electric force on an
electron in the metal is $q\vec{E}$, the maximum kinetic energy of emitted photoelectrons
should increase as the light beam is made more intense. Furthermore, since
$|\vec{E}|$ is independent of $\nu$, the photoelectron's maximum kinetic energy should
not depend on $\nu$. In particular, a current should flow for any frequency of the
light, provided the light beam is intense enough. But the observations show
that $(KE)_{max} = eV_0$ is independent of the intensity but depends on $\nu$, and that
no current flows for $\nu < \nu_0$!

One additional point: Since the energy of the classical wave is distributed over
the entire wave front, a single localized electron in the metal should absorb
only a small part of this energy (the energy incident on the effective area of the
electron). Thus, for a beam of very low intensity, there should be a time lag
between the time the light first impinges on the metal and the time the
photoelectrons are emitted (a time interval during which an electron in the metal
can absorb enough energy to escape from the metal). No such time lag has ever
been observed!

To explain the photoelectric effect, Einstein, in 1905, proposed the photon
concept - a concept which attributes particle-like properties to light.


1.4.1. Einstein’s Photon Hypothesis 

1. The energy in a light beam is not continuously distributed over space but
is localized in small bundles called photons. The energy of each photon is
proportional to the light wave's frequency

$$E_{photon} = h\nu \tag{1.90}$$

where $h$ is a proportionality constant, called Planck's constant (introduced
in 1900 by Planck to describe black body radiation).

2. The intensity of the light beam is proportional to the mean number of 
photons traveling per unit area per unit time. 

3. An electron bound in the metal can absorb a photon and thereby gain
its energy. The probability that a single electron absorbs 2 photons is
negligible, and an electron bound in the metal can only absorb a whole
photon, never part of a photon.

It is easy to see that these 3 assumptions explain the photoelectric effect
completely.

{energy absorbed by one electron} = {energy for electron to get to metal surface}
    + {energy for electron to leave metal surface} + {electron's KE after leaving metal}

where {energy absorbed by one electron} = $h\nu$ and {energy for electron to leave
metal surface} = $W_0$ (work function of metal) $> 0$.


The electrons which absorb a photon at the metal's surface (none wasted) have
the maximum possible kinetic energy

$$h\nu = W_0 + (KE)_{max} \rightarrow eV_0 = (KE)_{max} = h\nu - W_0 \tag{1.91}$$

This gives the required linear relationship between $(KE)_{max}$ and $\nu$. It also
shows that $(KE)_{max}$ is independent of the incident intensity. Note that for
$\nu < W_0/h$ no electron will absorb enough energy to leave the surface. Thus,
the cutoff frequency is $\nu_0 = W_0/h$, and $\nu < \nu_0$ implies that no photoelectrons will
be emitted. When the light beam's intensity is increased, more photons hit the
surface of the metal per unit time and therefore more electrons will leave the
surface per unit time when $\nu > \nu_0$.

Also note that the emission of photoelectrons with no time delay is an immediate 
consequence of the localization of the photon and its energy. 
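Equation (1.91) translates directly into a few lines of code. In the sketch below, energies are expressed in eV so that the stopping potential in volts is numerically equal to $(KE)_{max}$; the conversion factor between erg and eV and the work function of 2.3 eV are assumptions introduced for illustration and are not values from the text.

```python
H_ERG_S = 6.63e-27       # Planck's constant in erg*sec, as quoted in (1.92)
ERG_PER_EV = 1.602e-12   # erg per eV (standard conversion, assumed here)
H_EV_S = H_ERG_S / ERG_PER_EV


def cutoff_frequency(work_function_ev):
    # nu_0 = W_0 / h
    return work_function_ev / H_EV_S


def max_kinetic_energy_ev(nu, work_function_ev):
    # (KE)_max = h nu - W_0; returns None when no photoelectrons are emitted.
    ke = H_EV_S * nu - work_function_ev
    return ke if ke > 0 else None


def stopping_potential_volts(nu, work_function_ev):
    # e V_0 = (KE)_max, so V_0 in volts equals (KE)_max in eV.
    return max_kinetic_energy_ev(nu, work_function_ev)


W0 = 2.3                      # eV, hypothetical work function
nu0 = cutoff_frequency(W0)
below_cutoff = max_kinetic_energy_ev(0.9 * nu0, W0)
at_twice_cutoff = max_kinetic_energy_ev(2 * nu0, W0)
```

Note that the beam intensity never enters these functions: in the photon picture it changes only the number of emitted electrons per unit time, not their maximum energy.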


By measuring the slope of the $V_0$ versus $\nu$ curve and by knowing the electron
charge $q = -e$ one finds the experimental result

$$h = 6.63 \times 10^{-27}\ \text{erg}\cdot\text{sec} \tag{1.92}$$

It is interesting to note that the human eye can detect a single photon in the
visible range ($\nu \approx 10^{15}\ \text{sec}^{-1}$) or

$$E = h\nu \approx 7 \times 10^{-12}\ \text{erg} = 4.4\ \text{eV} \tag{1.93}$$


Now since light waves propagate at $c$, photons must travel at speed $c$. Two
relations we wrote down earlier, namely,

$$\frac{pc}{E} = \frac{v}{c}\ , \qquad \left(\frac{E}{c}\right)^2 - p^2 = m_0^2c^2 \tag{1.94}$$

then say that the rest mass of the photon is zero and $E = pc$. We then have

$$p = \frac{E}{c} = \frac{h\nu}{c} = \frac{h}{\lambda} \tag{1.95}$$

where $p$ is the magnitude of photon momentum.
$E = h\nu$ and $p = h/\lambda$ relate the particle properties $E$ and $p$ to the wave properties
$\nu$ and $\lambda$.


There are situations in which the wave properties and the particle properties of 
light are manifest in different aspects of the same experiment. 

Consider the double slit experiment in Figure 1.16 below. 


we detect the number of 
photoelectrons emitted at 



Figure 1.16: Double Slit Experiment 

If the incident beam is of very low intensity (say 1 photon per minute), then 
one can observe individual photoelectrons being emitted from different points 
on the screen. Each photoelectron is emitted at a point where the photon has 
struck the screen. 

The number of photoelectrons emitted over a period of time at various points 
on the viewing screen is observed to be given by the wave diffraction pattern. 
Thus, the probability of a given photon hitting the viewing screen at a given 
point (this probability is proportional to the number of photons which hit the 
screen at the given point over some time interval) is given by the wave diffraction 
pattern. 

This is the key point! 

The probability of observing the particle (photon) in a 
given region is given by the intensity of the wave 
(diffraction pattern) in that region. 
Thus, we have a statistical connection between the wave properties and the
particle properties!


Is light a wave or a particle? It is really neither! It is a physical system in 
which the probability of observing a photon is determined by the intensity of a 
wave. Indeed, if light propagation were just a wave phenomenon, then one could 
not explain the photoelectric effect. On the other hand, if a light beam were 
actually composed of well-defined localized particles (photons), one could not 
explain diffraction patterns. For example, consider the two experiments shown 
in Figure 1.17 below: 


Figure 1.17: Which Slit?
(left panel: one opening covered with opaque material; photons hit the screen with a fairly uniform distribution, $\theta \ll 1$)
(right panel: both openings are present; photons hit the screen with a diffraction pattern distribution, $\theta \ll 1$)

If the light beam were actually composed of localized photons, one could not 
explain why one opening leads to a uniform distribution while adding another 
opening (which increases the number of ways a photon can get to any point on 
the viewing screen) yields certain regions of the screen in which the probability 
of observing a photon decreases. 


1.4.2. Wave-Particle Duality of Electrons 

1. The particle-like properties of an electron are well-known. For example, 
one can view the trajectory of an electron in a bubble chamber. If the 
chamber is placed in E and B fields, then the observed trajectory is just 
the one determined by Newton’s second law with the Lorentz force 

$$\frac{d\vec{p}}{dt} = \vec{F} = q\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{B}\right)\ , \quad \vec{p} = m\gamma\vec{v} \tag{1.96}$$

2. The wave-like properties of an electron are exhibited in the diffraction of
electrons by a crystal (Davisson and Germer, Phys. Rev. 30, 705 (1927)).
The experiment is shown in Figure 1.18 below:



Figure 1.18: Electron Diffraction (label: atoms in crystal structure)


The experimental distribution of electrons (shown on the right) is a typical
diffraction pattern of a wave having wavelength

$$\lambda = \frac{h}{p} \tag{1.97}$$

which is called the de Broglie wavelength. The probability of observing
the particle (electron) is determined (in this experiment) by the intensity
distribution of a wave.

This relationship $\lambda = h/p$ for matter waves (whatever they might be!) was
predicted in 1923 by Louis de Broglie. He argued that matter should exhibit a
wave-particle duality just the way light exhibits such a duality - he argued that
the relation $\lambda = h/p$ should relate the wave and particle properties for matter as
well as for light.

Davisson and Germer confirmed this hypothesis in 1927 when they scattered 
electrons from a nickel crystal and observed a diffraction pattern distribution. 


Why isn’t the diffraction of macroscopic objects observed? 

Consider the experiment shown in Figure 1.19 below. The first minimum occurs
for

$$d\sin\theta = \frac{\lambda}{2} \rightarrow \sin\theta = \frac{\lambda}{2d} \tag{1.98}$$

Assume that the incident particles are macroscopic, that is,

$$m = 1\ \text{gm}\ , \quad v = 1\ \text{cm/sec} \rightarrow p = 1\ \text{gm}\cdot\text{cm/sec} \rightarrow \lambda = \frac{h}{p} \approx 7 \times 10^{-27}\ \text{cm}$$

so that the first minimum occurs at

$$\sin\theta = \frac{1}{2}\frac{\lambda}{d} = \frac{7 \times 10^{-27}\ \text{cm}}{2d} \tag{1.99}$$

For any realistic $d$, this yields a $\theta$ so small that the oscillations in the diffraction
pattern cannot be observed. Thus, $\lambda \ll d$ for macroscopic objects and diffraction
patterns cannot be resolved - in this short de Broglie wavelength limit, one
obtains classical mechanics. The wave-particle duality is present. It is just that
the wave patterns are too fine to be resolved.


In addition, if $v$ were as small as $10^{-8}$ cm/sec (an atomic distance per second)
for macroscopic particles,

$$m \approx 1\ \text{gm}\ , \quad p \approx 10^{-8}\ \text{gm}\cdot\text{cm/sec}\ , \quad \lambda \approx 7 \times 10^{-19}\ \text{cm} \tag{1.100}$$

Again, for any realistic $d$, the diffraction pattern still cannot be seen.
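The estimates above can be reproduced with a short calculation. The sketch below uses the value of $h$ from (1.92) in CGS units; the slit spacing `d_cm` and the electron mass and speed used for contrast are assumed illustrative values, not numbers taken from the text.

```python
H_ERG_S = 6.63e-27  # Planck's constant in erg*sec, from (1.92)


def de_broglie_wavelength_cm(mass_g, speed_cm_s):
    # lambda = h / p with p = m v (non-relativistic momentum)
    return H_ERG_S / (mass_g * speed_cm_s)


def first_minimum_sin_theta(wavelength_cm, d_cm):
    # sin(theta) = lambda / (2 d), from (1.98)
    return wavelength_cm / (2 * d_cm)


# Macroscopic particle from the text: 1 gm moving at 1 cm/sec, then at 1e-8 cm/sec.
lambda_macro = de_broglie_wavelength_cm(1.0, 1.0)
lambda_macro_slow = de_broglie_wavelength_cm(1.0, 1.0e-8)

# For contrast: an electron (mass ~9.11e-28 gm, assumed) at an assumed 1e8 cm/sec.
lambda_electron = de_broglie_wavelength_cm(9.11e-28, 1.0e8)

d_cm = 1.0e-4  # hypothetical slit spacing of one micron
sin_theta_macro = first_minimum_sin_theta(lambda_macro, d_cm)
```

The electron wavelength comes out near $10^{-8}$ cm, comparable to atomic spacings in a crystal, which is why electron diffraction is observable while the macroscopic angles are hopelessly small.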


All physical systems (light, electrons, neutrons, baseballs, etc.) exhibit a
so-called wave-particle duality. The connection between the wave properties and
the particle properties is statistical - the intensity of the wave in a given region
determines the probability of observing the particle in that region. Wave and
particle properties are related by the de Broglie relation $\lambda = h/p$.
Chapter 2


Formulation of Wave Mechanics - Part 1


Using the ideas from Chapter 1, we can now set up a version of quantum theory
called Wave Mechanics.

2.1. Basic Theory

2.1.1. Postulate 1a

Motivation: probability $\propto |\eta|^2$

Given the initial conditions (to be specified later) and the particle interactions
(forces acting on the particle), there exists a complex-valued function $\psi(\vec{x},t)$,
called the wave function or the probability amplitude, such that

1. The quantity

$$\int_{\text{all space}} d^3x\,|\psi(\vec{x},t)|^2 \tag{2.1}$$

is finite and non-zero.

2. The probability is given by

$$p(\vec{x},t) = \frac{|\psi(\vec{x},t)|^2}{\int_{\text{all space}} d^3x\,|\psi(\vec{x},t)|^2} \tag{2.2}$$

Note that no time averages are taken.

From now on $\int d^3x\,(\ldots)$ means an integration over all space.
Additional Notes


1. The condition

$$\int d^3x\,|\psi(\vec{x},t)|^2 \text{ is finite and non-zero}$$

is referred to by saying that $\psi(\vec{x},t)$ is normalizable.

2. $p(\vec{x},t)\,d^3x$ = probability of observing the particle in $(x, x + dx)$, $(y, y + dy)$ and
$(z, z + dz)$ at time $t$.

3.

$$p(\vec{x},t) = \frac{|\psi(\vec{x},t)|^2}{\int_{\text{all space}} d^3x\,|\psi(\vec{x},t)|^2}$$

satisfies the required conditions for a probability density:

$$p(\vec{x},t) \geq 0$$

$$\int d^3x\,p(\vec{x},t) = 1 \text{ for all } t$$

4. $\psi(\vec{x},t)$ need not be a continuous function of $\vec{x}$. The only requirement is
that it be normalizable. We will, however, assume continuity for physical
reasons.

5. A plane wave of definite frequency is not normalizable.

$$\psi(\vec{x},t) = e^{i(\vec{k}\cdot\vec{x} - \omega t)} \Rightarrow |\psi(\vec{x},t)|^2 = 1 \Rightarrow \int d^3x\,|\psi(\vec{x},t)|^2 = \infty$$

6. $\psi(\vec{x},t)$ must be allowed to take on complex values. If $\psi(\vec{x},t = 0)$ is real,
then the time dependence that will be postulated later will usually yield
a $\psi(\vec{x},t > 0)$ that is complex.

7. $\psi(\vec{x},t)$ and $\tilde{\psi}(\vec{x},t) = Z\psi(\vec{x},t)$, with $Z$ a non-zero complex number, determine the same $p(\vec{x},t)$:

$$\tilde{p}(\vec{x},t) = \frac{|\tilde{\psi}(\vec{x},t)|^2}{\int_{\text{all space}} d^3x\,|\tilde{\psi}(\vec{x},t)|^2} = \frac{|Z|^2\,|\psi(\vec{x},t)|^2}{|Z|^2\int_{\text{all space}} d^3x\,|\psi(\vec{x},t)|^2} = p(\vec{x},t)$$

8. $\psi(\vec{x},t)$ should be thought of as a construct of the mind to facilitate the
prediction of probabilities. It is meaningless to ask whether $\psi(\vec{x},t)$ really
exists as a physical quantity. $p(\vec{x},t)$ is the measurable quantity, and
$\psi(\vec{x},t)$ only helps us to calculate $p(\vec{x},t)$. It is analogous to the situation
with $\vec{E}$ and $\vec{B}$ fields - $\vec{E}$ and $\vec{B}$ are mental concepts which allow us to
calculate the force exerted on one charge by another charge (we think of
one charge producing $\vec{E}$ and $\vec{B}$ fields which propagate to another charge
and which then give rise to a force on the other charge).

9. One wave function $\psi(\vec{x},t)$ is adequate to describe certain particles, called
spinless particles (for example, an alpha particle). However, other particles
(particles with spin, for example, electrons, photons) require a wave
function with several components for an accurate description (for example,
a photon requires a vector wave function [3 components] to describe it
because of its various modes of polarization). For non-relativistic speeds,
an electron is approximately described by one wave function [one
component]. We shall restrict our attention to one-component wave functions
during most of chapters 2 and 3.

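Let $f(\vec{x})$ be an arbitrary function of $\vec{x}$. Then the average (or expectation) value
of $f(\vec{x})$ is defined by

$$\langle f(\vec{x}) \rangle = \int d^3x\,p(\vec{x},t)\,f(\vec{x}) = \frac{\int d^3x\,|\psi(\vec{x},t)|^2\,f(\vec{x})}{\int d^3x\,|\psi(\vec{x},t)|^2} \tag{2.3}$$

which may change with time. Therefore,

$$\langle f(\vec{x}) \rangle = \frac{\int d^3x\,\psi^*(\vec{x},t)\,f(\vec{x})\,\psi(\vec{x},t)}{\int d^3x\,\psi^*(\vec{x},t)\,\psi(\vec{x},t)} \tag{2.4}$$


Definition

$\psi(\vec{x},t)$ is said to be normalized if

$$\int d^3x\,|\psi(\vec{x},t)|^2 = 1 \tag{2.5}$$

Let $\psi(\vec{x},t)$ be a normalizable wave function. Then

$$\tilde{\psi}(\vec{x},t) = \frac{\psi(\vec{x},t)}{\sqrt{\int d^3x\,|\psi(\vec{x},t)|^2}} \tag{2.6}$$

is obviously normalized and determines the same $p(\vec{x},t)$ as $\psi(\vec{x},t)$.

If $\psi(\vec{x},t)$ is normalized, then

$$p(\vec{x},t) = |\psi(\vec{x},t)|^2 \quad \text{and} \quad \langle f(\vec{x}) \rangle = \int d^3x\,\psi^*(\vec{x},t)\,f(\vec{x})\,\psi(\vec{x},t) \tag{2.7}$$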
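The definitions (2.3)-(2.7) can be tried out on a discretized wave function. The sketch below is a one-dimensional simplification (the text works in three dimensions), with the integral replaced by a Riemann sum on a uniform grid. The Gaussian wave packet, the grid, and all function names are assumptions made for illustration.

```python
import cmath
import math


def norm_squared(psi, dx):
    # Discretized integral of |psi|^2 over the grid.
    return sum(abs(p) ** 2 for p in psi) * dx


def normalize(psi, dx):
    # Equation (2.6): divide by the square root of the norm integral.
    scale = math.sqrt(norm_squared(psi, dx))
    return [p / scale for p in psi]


def expectation(psi, xs, dx, f):
    # Equation (2.3): weighted average of f with weight |psi|^2.
    weighted = sum(abs(p) ** 2 * f(x) for p, x in zip(psi, xs)) * dx
    return weighted / norm_squared(psi, dx)


dx = 0.01
xs = [-10.0 + n * dx for n in range(2001)]

# Unnormalized complex wave packet centered at x = 1 (arbitrary choice).
psi = [3.0 * cmath.exp(2j * x) * math.exp(-((x - 1.0) ** 2) / 2) for x in xs]
psi_normalized = normalize(psi, dx)

mean_x = expectation(psi, xs, dx, lambda x: x)
mean_x_normalized = expectation(psi_normalized, xs, dx, lambda x: x)
mean_x_squared = expectation(psi, xs, dx, lambda x: x * x)
```

Because (2.3) divides by the norm integral, `mean_x` is the same whether or not the wave function has been normalized first, which is the content of Additional Note 7.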
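Definition


A function $\phi(\vec{x})$ (may also depend on $t$) is square-integrable if $\int d^3x\,|\phi(\vec{x})|^2$ is
finite.

A normalizable function is square-integrable and satisfies $\int d^3x\,|\phi(\vec{x})|^2 \neq 0$.

Theorem

If $\phi_1(\vec{x})$ and $\phi_2(\vec{x})$ are square-integrable, then $\lambda_1\phi_1(\vec{x}) + \lambda_2\phi_2(\vec{x})$ is also
square-integrable for any complex $\lambda_1$ and $\lambda_2$.

Proof:

$$\begin{aligned}
\int d^3x\,|\lambda_1\phi_1(\vec{x}) + \lambda_2\phi_2(\vec{x})|^2
&= \int d^3x\,\bigl[|\lambda_1|^2|\phi_1(\vec{x})|^2 + |\lambda_2|^2|\phi_2(\vec{x})|^2 + \lambda_1^*\lambda_2\,\phi_1^*(\vec{x})\phi_2(\vec{x}) + \lambda_1\lambda_2^*\,\phi_1(\vec{x})\phi_2^*(\vec{x})\bigr] \\
&= \int d^3x\,\bigl[|\lambda_1|^2|\phi_1(\vec{x})|^2 + |\lambda_2|^2|\phi_2(\vec{x})|^2 + 2\,\mathrm{Re}\bigl(\lambda_1^*\lambda_2\,\phi_1^*(\vec{x})\phi_2(\vec{x})\bigr)\bigr]
\end{aligned}$$

Now let $u_1 = \lambda_1\phi_1(\vec{x})$ and $u_2 = \lambda_2\phi_2(\vec{x})$. Then

$$|u_1^*u_2| = \sqrt{[\mathrm{Re}(u_1^*u_2)]^2 + [\mathrm{Im}(u_1^*u_2)]^2} \geq \mathrm{Re}(u_1^*u_2)$$

Also,

$$\bigl[|u_1| - |u_2|\bigr]^2 = |u_1|^2 + |u_2|^2 - 2|u_1^*u_2| \geq 0 \rightarrow |u_1^*u_2| \leq \frac{1}{2}\bigl[|u_1|^2 + |u_2|^2\bigr]$$

Therefore,

$$\mathrm{Re}(u_1^*u_2) \leq |u_1^*u_2| \leq \frac{1}{2}\bigl[|u_1|^2 + |u_2|^2\bigr]$$

and

$$\int d^3x\,\mathrm{Re}\bigl(\lambda_1^*\lambda_2\,\phi_1^*(\vec{x})\phi_2(\vec{x})\bigr) \leq \frac{1}{2}\int d^3x\,\bigl\{|\lambda_1|^2|\phi_1(\vec{x})|^2 + |\lambda_2|^2|\phi_2(\vec{x})|^2\bigr\}$$

so that

$$\int d^3x\,|\lambda_1\phi_1(\vec{x}) + \lambda_2\phi_2(\vec{x})|^2 \leq 2\int d^3x\,\bigl\{|\lambda_1|^2|\phi_1(\vec{x})|^2 + |\lambda_2|^2|\phi_2(\vec{x})|^2\bigr\} = \text{finite}$$

Therefore, $\int d^3x\,|\lambda_1\phi_1(\vec{x}) + \lambda_2\phi_2(\vec{x})|^2$ is finite and $\lambda_1\phi_1(\vec{x}) + \lambda_2\phi_2(\vec{x})$ is
square-integrable.

This theorem implies that the set of all square-integrable functions forms a linear
vector space (with the usual addition of functions and the usual multiplication
of a function by a complex constant) - this space is designated $L^2$.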
2.1.2. Postulate 1b

Motivation: Linear superposition of classical waves.


If $\psi_1(\vec{x},t)$ and $\psi_2(\vec{x},t)$ are possible wave functions for a particle under the
influence of given forces (that is, $\psi_1(\vec{x},t)$ and $\psi_2(\vec{x},t)$ could correspond to
different initial conditions), then, for any complex constants $\lambda_1$ and $\lambda_2$,
$\lambda_1\psi_1(\vec{x},t) + \lambda_2\psi_2(\vec{x},t)$, which is necessarily square-integrable by the previous theorem, is a
possible wave function for the particle under the influence of the given forces,
provided that $\lambda_1\psi_1(\vec{x},t) + \lambda_2\psi_2(\vec{x},t)$ is not identically zero at any time, where a
function $f(\vec{x},t)$ is identically zero at time $t$ if $f(\vec{x},t) = 0$ for all $\vec{x}$.

Let $\phi_1(\vec{x})$ and $\phi_2(\vec{x}) \in L^2$, that is, they are square-integrable. Then,

$$\left|\int d^3x\,\phi_1^*(\vec{x})\phi_2(\vec{x})\right| \leq \int d^3x\,|\phi_1^*(\vec{x})\phi_2(\vec{x})| \leq \frac{1}{2}\int d^3x\,\bigl[|\phi_1(\vec{x})|^2 + |\phi_2(\vec{x})|^2\bigr] = \text{finite}$$

so that $\int d^3x\,\phi_1^*(\vec{x})\phi_2(\vec{x})$ is finite.

Definition

Let $\phi_1(\vec{x})$ and $\phi_2(\vec{x}) \in L^2$. The scalar product (inner product) of $\phi_1(\vec{x})$ and
$\phi_2(\vec{x})$ is defined to be

$$\langle \phi_1 \mid \phi_2 \rangle = (\phi_1, \phi_2) = \int d^3x\,\phi_1^*(\vec{x})\,\phi_2(\vec{x}) \tag{2.8}$$

where the 2nd expression is the standard scalar product notation, the 1st
expression is a different notation (due to Dirac) for the same thing and the 3rd
expression is the actual definition. The scalar product is finite for $\phi_1(\vec{x})$ and
$\phi_2(\vec{x}) \in L^2$.

Properties of this Inner Product

(obvious by inspection)

1. $\langle \phi_1 \mid \phi_2 \rangle^* = \langle \phi_2 \mid \phi_1 \rangle$

2.

$$\langle \phi_1 \mid \lambda_2\phi_2 + \lambda_3\phi_3 \rangle = \lambda_2\langle \phi_1 \mid \phi_2 \rangle + \lambda_3\langle \phi_1 \mid \phi_3 \rangle$$

$$\langle \lambda_2\phi_2 + \lambda_3\phi_3 \mid \phi_1 \rangle = \lambda_2^*\langle \phi_2 \mid \phi_1 \rangle + \lambda_3^*\langle \phi_3 \mid \phi_1 \rangle$$

3. $\langle \phi \mid \phi \rangle$ is real with $\langle \phi \mid \phi \rangle \geq 0$, where equality occurs if and only if $\phi(\vec{x}) = 0$
almost everywhere (a.e.), that is, $\phi(\vec{x}) = 0$ at all $\vec{x}$ with the possible exception
of some isolated points.
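The three properties can be spot-checked on a grid. The sketch below is a one-dimensional, discretized stand-in for the integral in (2.8); the particular test functions and complex constants are arbitrary assumptions chosen only to exercise the identities.

```python
import math


def inner(phi1, phi2, dx):
    # Discretized <phi1|phi2>: sum of conj(phi1) * phi2 times the grid spacing.
    return sum(a.conjugate() * b for a, b in zip(phi1, phi2)) * dx


dx = 0.01
xs = [-8.0 + n * dx for n in range(1601)]

# Three square-integrable test functions (arbitrary choices).
phi1 = [complex(math.exp(-x * x / 2), 0.0) for x in xs]
phi2 = [complex(x, 1.0) * math.exp(-x * x / 2) for x in xs]
phi3 = [complex(0.0, x * x) * math.exp(-x * x / 2) for x in xs]

lam2, lam3 = 2 - 1j, 0.5 + 3j
combo = [lam2 * b + lam3 * c for b, c in zip(phi2, phi3)]

# Property 1: conjugate symmetry.
symmetry_error = abs(inner(phi1, phi2, dx).conjugate() - inner(phi2, phi1, dx))

# Property 2: linear in the second slot, antilinear in the first.
linear_error = abs(inner(phi1, combo, dx)
                   - (lam2 * inner(phi1, phi2, dx) + lam3 * inner(phi1, phi3, dx)))
antilinear_error = abs(inner(combo, phi1, dx)
                       - (lam2.conjugate() * inner(phi2, phi1, dx)
                          + lam3.conjugate() * inner(phi3, phi1, dx)))

# Property 3: <phi|phi> is real and non-negative.
self_product = inner(phi2, phi2, dx)
```

Note the complex conjugates on $\lambda_2$ and $\lambda_3$ when the linear combination sits in the first slot; this is the convention fixed by the definition (2.8).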
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Definition: 0(x is normalized if (0|0) = 1 

Definition: 0i(x) and 02(5) are orthogonal if (0i Ifc) = 0 

Definition: Let f(x ) be an arbitrary function of x. The matrix element of 
/(5) between 0i(x) and 02(5) is defined to be: 

{<f>i\f(x)\<h) = {<l>ij(x)<h) = J d 3 x<j)l(x)f(x)<j> 2 (5) (2.9) 

where the 2 nd expression is the standard scalar product notation, the 1 st ex¬ 
pression is a different notation (due to Dirac) for the same thing and the 3 rd 
expression is the actual definition. 


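In the Dirac notation, we have $\langle\phi_1|f(\vec{x})|\phi_2\rangle$, where

$\langle\phi_1|$ = bra vector , $|\phi_2\rangle$ = ket vector

and

$\langle\phi_1|f(\vec{x})|\phi_2\rangle$ = bracket

In Dirac notation, Postulate 1a takes the following form:

1. $\langle\psi|\psi\rangle$ is finite and non-zero $\qquad$ (2.10)

2. $$\wp(\vec{x},t) = \frac{|\psi(\vec{x},t)|^2}{\langle\psi|\psi\rangle} \tag{2.11}$$

Also

$$\langle f(\vec{x})\rangle = \frac{\langle\psi|f(\vec{x})|\psi\rangle}{\langle\psi|\psi\rangle} \tag{2.12}$$

If $\psi(\vec{x},t)$ is normalized, then we have

$$\langle\psi|\psi\rangle = 1 \;,\quad \wp(\vec{x},t) = |\psi(\vec{x},t)|^2 \;,\quad \langle f(\vec{x})\rangle = \langle\psi|f(\vec{x})|\psi\rangle \tag{2.13}$$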


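Theorem: Schwarz Inequality

Let $\phi_1(\vec{x})$ and $\phi_2(\vec{x}) \in L^2$. Then

$$|\langle\phi_1|\phi_2\rangle|^2 \le \langle\phi_1|\phi_1\rangle\langle\phi_2|\phi_2\rangle \tag{2.14}$$

where the equality occurs if and only if $\phi_2(\vec{x}) = \lambda\phi_1(\vec{x})$ a.e. for $\lambda$ a complex
constant.

Proof

Consider $\phi_2(\vec{x}) - \lambda\phi_1(\vec{x})$ with $\lambda$ arbitrary. Then we must have
$\langle\phi_2 - \lambda\phi_1|\phi_2 - \lambda\phi_1\rangle \ge 0$ for any $\lambda$. Again, equality occurs if and only if
$\phi_2(\vec{x}) = \lambda\phi_1(\vec{x})$ a.e. We then find

$$\langle\phi_2 - \lambda\phi_1|\phi_2 - \lambda\phi_1\rangle = \langle\phi_2|\phi_2\rangle + |\lambda|^2\langle\phi_1|\phi_1\rangle - \lambda\langle\phi_2|\phi_1\rangle - \lambda^*\langle\phi_1|\phi_2\rangle \ge 0$$

This must be true for all $\lambda$. In particular, it is true for

$$\lambda = \frac{\langle\phi_1|\phi_2\rangle}{\langle\phi_1|\phi_1\rangle} \tag{2.15}$$

Note that if $\langle\phi_1|\phi_1\rangle = 0$, then the theorem is trivially true since

$$|\langle\phi_1|\phi_2\rangle|^2 = 0 = \langle\phi_1|\phi_1\rangle\langle\phi_2|\phi_2\rangle$$

Using the choice for $\lambda$ in (2.15) we have

$$\langle\phi_2|\phi_2\rangle + \frac{|\langle\phi_1|\phi_2\rangle|^2}{\langle\phi_1|\phi_1\rangle^2}\langle\phi_1|\phi_1\rangle - \frac{\langle\phi_1|\phi_2\rangle}{\langle\phi_1|\phi_1\rangle}\langle\phi_2|\phi_1\rangle - \frac{\langle\phi_1|\phi_2\rangle^*}{\langle\phi_1|\phi_1\rangle}\langle\phi_1|\phi_2\rangle \ge 0$$

$$\langle\phi_2|\phi_2\rangle - \frac{|\langle\phi_1|\phi_2\rangle|^2}{\langle\phi_1|\phi_1\rangle} \ge 0 \;\Rightarrow\; |\langle\phi_1|\phi_2\rangle|^2 \le \langle\phi_1|\phi_1\rangle\langle\phi_2|\phi_2\rangle$$

with equality if and only if $\phi_2(\vec{x}) = \lambda\phi_1(\vec{x})$ a.e.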


Note: The above inner product, its properties, and the Schwarz inequality can
be generalized in an obvious manner to functions of $n$ variables as shown below:

1. $\phi(x_1,\ldots,x_n)$ is square-integrable if $\int d^nx\,\phi^*(x_1,\ldots,x_n)\phi(x_1,\ldots,x_n)$
is finite, where $d^nx = dx_1\,dx_2\cdots dx_n$.

2. $\langle\phi_1|\phi_2\rangle = \int d^nx\,\phi_1^*(x_1,\ldots,x_n)\phi_2(x_1,\ldots,x_n)$

3. $\langle\phi_1|f(x_1,\ldots,x_n)|\phi_2\rangle = \int d^nx\,\phi_1^*(x_1,\ldots,x_n)f(x_1,\ldots,x_n)\phi_2(x_1,\ldots,x_n)$


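Examples

Let $\phi_1(\vec{x}) = e^{-r/2}$ and $\phi_2(\vec{x}) = re^{-r/2}$, then

$$\langle\phi_1|\phi_1\rangle = \int d^3x\,\phi_1^*(r,\theta,\varphi)\phi_1(r,\theta,\varphi) = \int_0^\infty r^2e^{-r}dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi = 4\pi\int_0^\infty r^2e^{-r}dr = 8\pi$$

where we have used

$$\int_0^\infty du\,u^ne^{-u} = n! \quad\text{for } n = 0,1,2,\ldots \tag{2.16}$$

Similarly,

$$\langle\phi_2|\phi_2\rangle = \int d^3x\,\phi_2^*(r,\theta,\varphi)\phi_2(r,\theta,\varphi) = \int_0^\infty r^4e^{-r}dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi = 4\pi\int_0^\infty r^4e^{-r}dr = 96\pi$$

and

$$\langle\phi_1|\phi_2\rangle = \int d^3x\,\phi_1^*(r,\theta,\varphi)\phi_2(r,\theta,\varphi) = \int_0^\infty r^3e^{-r}dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi = 4\pi\int_0^\infty r^3e^{-r}dr = 24\pi$$

We then have

$$|\langle\phi_1|\phi_2\rangle|^2 = 576\pi^2 \;,\quad \langle\phi_1|\phi_1\rangle\langle\phi_2|\phi_2\rangle = 768\pi^2 \tag{2.17}$$

so that

$$|\langle\phi_1|\phi_2\rangle|^2 < \langle\phi_1|\phi_1\rangle\langle\phi_2|\phi_2\rangle \tag{2.18}$$

as required by the Schwarz inequality.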
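The three radial integrals in this example can be checked numerically. The following is only a sketch, assuming NumPy is available; the helper name `radial_inner`, the cutoff `r_max` and the grid size are choices made here, not part of the text.

```python
import numpy as np

def radial_inner(power, r_max=60.0, n=600_001):
    """Approximate 4*pi * integral_0^inf r**power * exp(-r) dr (trapezoid rule).

    The angular integrals give the factor 4*pi; the upper limit is cut off
    at r_max, where the integrand is negligibly small.
    """
    r = np.linspace(0.0, r_max, n)
    y = r**power * np.exp(-r)
    dr = r[1] - r[0]
    return 4.0 * np.pi * dr * (y.sum() - 0.5 * (y[0] + y[-1]))

n11 = radial_inner(2)   # <phi1|phi1>, analytic value 8*pi
n22 = radial_inner(4)   # <phi2|phi2>, analytic value 96*pi
n12 = radial_inner(3)   # <phi1|phi2>, analytic value 24*pi

# Schwarz inequality: |<phi1|phi2>|^2 <= <phi1|phi1><phi2|phi2>
schwarz_holds = n12**2 <= n11 * n22
print(n11 / np.pi, n22 / np.pi, n12 / np.pi, schwarz_holds)
```

Because $\phi_2$ is not a multiple of $\phi_1$, the inequality is strict here, in line with the equality condition of the theorem.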


Now let $\vec{k}$ be some fixed vector. Then we have

$$\langle\phi_2|\,\vec{k}\cdot\vec{x}\,|\phi_1\rangle = \int d^3x\,\phi_2^*(\vec{x})(\vec{k}\cdot\vec{x})\phi_1(\vec{x}) \tag{2.19}$$

Now we are free to choose our coordinate system for specifying the spherical
polar coordinates of $\vec{x}$ so that the z-axis is along $\vec{k}$ (remember that $\vec{k}$ is fixed
during the integration) as shown in Figure 2.1 below.

Figure 2.1: Vector Orientations

We then have

$$\vec{k}\cdot\vec{x} = kz = kr\cos\theta \;,\quad d^3x = r^2\sin\theta\,dr\,d\theta\,d\varphi$$

and we get

$$\langle\phi_2|\,\vec{k}\cdot\vec{x}\,|\phi_1\rangle = \int_0^\infty r^2dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi\,(kr\cos\theta)(re^{-r})$$

$$= k\left(\int_0^\infty r^4e^{-r}dr\right)\left(\int_0^\pi\sin\theta\cos\theta\,d\theta\right)\left(\int_0^{2\pi}d\varphi\right) = k(4!)(0)(2\pi) = 0$$

Now let $\phi(\vec{x}) = 1/r$, then we have

$$\langle\phi|\phi\rangle = \int d^3x\,\phi^*(r,\theta,\varphi)\phi(r,\theta,\varphi) = \int_0^\infty r^2dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi\,\frac{1}{r^2} = 4\pi\int_0^\infty dr$$

The integral diverges at the upper limit $r \to \infty$. Therefore, $\phi(\vec{x}) = 1/r$ is not in $L^2$.

Now let $\phi(\vec{x}) = 1/r^2$, then we have

$$\langle\phi|\phi\rangle = \int d^3x\,\phi^*(r,\theta,\varphi)\phi(r,\theta,\varphi) = \int_0^\infty r^2dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi\,\frac{1}{r^4} = 4\pi\int_0^\infty\frac{dr}{r^2}$$

The integral diverges at the lower limit $r \to 0$. Therefore, $\phi(\vec{x}) = 1/r^2$ is not in $L^2$.

Note: Now

$$\langle\phi|\phi\rangle = \int_0^\infty r^2dr\int_0^\pi\sin\theta\,d\theta\int_0^{2\pi}d\varphi\,|\phi(\vec{x})|^2 \tag{2.20}$$

Therefore, for the integration over $r$ to converge, it is clearly necessary for
$|\phi(\vec{x})| \to 0$ sufficiently fast as $r \to \infty$. Thus, square-integrable functions must
vanish as $r \to \infty$.

If the wave function $\psi(\vec{x},t)$ for a particle is given, we can calculate $\wp(\vec{x},t)$ from
postulate 1. Postulate 2 will tell us how to calculate $\wp(\vec{p},t)$ (the momentum
probability distribution) from $\psi(\vec{x},t)$. For this, we must introduce the Fourier
transform of $\psi(\vec{x},t)$.


2.1.3. Fourier Series and Transforms and Dirac Delta Function

Fourier Series

Let $f(x)$ be a square-integrable function of one variable on the interval $[-a/2, a/2]$,
that is,

$$\int_{-a/2}^{a/2}dx\,|f(x)|^2 \quad\text{is finite} \tag{2.21}$$

Then

$$f(x) = \sum_{n=-\infty}^{\infty}c_n\frac{e^{i\frac{2\pi nx}{a}}}{\sqrt{a}} \tag{2.22}$$

with convergence a.e.

Let

$$u_n(x) = \frac{e^{i\frac{2\pi nx}{a}}}{\sqrt{a}} = \frac{1}{\sqrt{a}}\left(\cos\frac{2\pi nx}{a} + i\sin\frac{2\pi nx}{a}\right) \tag{2.23}$$

so that

$$f(x) = \sum_{n=-\infty}^{\infty}c_nu_n(x) \tag{2.24}$$

Now,

$$\langle u_n|u_m\rangle = \int_{-a/2}^{a/2}dx\,u_n^*(x)u_m(x) = \frac{1}{a}\int_{-a/2}^{a/2}dx\,e^{i\frac{2\pi x}{a}(m-n)} = \frac{e^{i\pi(m-n)} - e^{-i\pi(m-n)}}{2\pi i(m-n)} = \delta_{nm} = \begin{cases}1 & \text{for } n = m\\ 0 & \text{for } n \ne m\end{cases} \tag{2.25}$$

where $\delta_{nm}$ is the Kronecker delta.

Therefore, multiplying

$$f(x) = \sum_{n=-\infty}^{\infty}c_nu_n(x)$$

by $u_m^*(x)$ and integrating over $x$, we get

$$\int_{-a/2}^{a/2}dx\,u_m^*(x)f(x) = \sum_{n=-\infty}^{\infty}c_n\int_{-a/2}^{a/2}dx\,u_m^*(x)u_n(x) = \sum_{n=-\infty}^{\infty}c_n\delta_{nm} = c_m$$

$$c_m = \frac{1}{\sqrt{a}}\int_{-a/2}^{a/2}dx\,f(x)e^{-i\frac{2\pi mx}{a}} \tag{2.26}$$

Note that the Fourier series gives rise to an expansion of $f(x)$ on $[-a/2, a/2]$ in
terms of the functions $e^{i2\pi nx/a} = e^{ikx}$ where

$$k = \frac{2\pi n}{a} = \frac{2\pi}{a}n \tag{2.27}$$

We are therefore expanding $f(x)$ in terms of sines and cosines with wavelengths
$a, a/2, a/3, \ldots$
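The orthonormality relation (2.25) and the coefficient formula (2.26) can be tried out on a grid. This is a minimal sketch, assuming NumPy; the interval length `a`, the test function $f(x) = x^2$, the cutoff at $|n| \le 50$ and the helper names `u` and `inner` are illustrative choices, not taken from the text.

```python
import numpy as np

a = 2.0                      # interval length (arbitrary choice)
N = 4000                     # number of midpoint samples on [-a/2, a/2]
dx = a / N
x = -a / 2 + dx * (np.arange(N) + 0.5)

def u(n):
    """Basis function u_n(x) = exp(i 2 pi n x / a) / sqrt(a), as in (2.23)."""
    return np.exp(2j * np.pi * n * x / a) / np.sqrt(a)

def inner(f, g):
    """Midpoint-rule approximation of <f|g> on [-a/2, a/2]."""
    return np.sum(np.conj(f) * g) * dx

f = x**2                                      # illustrative test function
ns = np.arange(-50, 51)
c = np.array([inner(u(n), f) for n in ns])    # c_n = <u_n|f>, as in (2.26)

# Partial sum of the series (2.24), truncated at |n| <= 50
f_partial = sum(cn * u(n) for cn, n in zip(c, ns))

overlap_same = inner(u(3), u(3))              # should be close to 1
overlap_diff = inner(u(3), u(5))              # should be close to 0
max_err = np.max(np.abs(f_partial - f))       # truncation error of the partial sum
print(abs(overlap_same), abs(overlap_diff), max_err)
```

The remaining error comes from truncating the sum; it shrinks as more terms (shorter wavelengths) are included.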


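Now we let $a \to \infty$, which will give us a heuristic proof of the Fourier integral
theorem. We have

$$f(x) = \sum_{n=-\infty}^{\infty}c_n\frac{e^{i\frac{2\pi nx}{a}}}{\sqrt{a}} = \sum_{k}c_n\frac{e^{ikx}}{\sqrt{a}} \tag{2.28}$$

where $k = 2\pi n/a$ varies from $-\infty$ to $+\infty$ in steps of $\Delta k = 2\pi/a$ ($\Delta n = 1$). Then
$a \to \infty \Rightarrow \Delta k \to 0$. Therefore,

$$f(x) = \sum_{k}c_n\frac{e^{ikx}}{\sqrt{a}}\,\frac{a(\Delta k)}{2\pi} \quad\text{since}\quad \frac{a(\Delta k)}{2\pi} = 1 \tag{2.30}$$

Now let

$$F(k) = \sqrt{\frac{a}{2\pi}}\,c_n \tag{2.31}$$

so that

$$f(x) = \sum_{k=-\infty}^{\infty}e^{ikx}F(k)\frac{\Delta k}{\sqrt{2\pi}} \tag{2.32}$$

Then as $a \to \infty \Rightarrow \Delta k \to 0$, we have

$$f(x) = \int_{-\infty}^{\infty}\frac{dk}{\sqrt{2\pi}}\,e^{ikx}F(k) \tag{2.33}$$

$$F(k) = \sqrt{\frac{a}{2\pi}}\,c_n = \int_{-a/2}^{a/2}\frac{dx}{\sqrt{2\pi}}\,f(x)e^{-ikx} \;\to\; \int_{-\infty}^{\infty}\frac{dx}{\sqrt{2\pi}}\,f(x)e^{-ikx} \tag{2.34}$$

which is the Fourier Transform of $f(x)$. Thus, we have the Fourier Integral
Theorem (heuristically proved):

If $f(x)$ is square-integrable, the Fourier transform

$$F(k) = \int_{-\infty}^{\infty}\frac{dx}{\sqrt{2\pi}}\,f(x)e^{-ikx} \tag{2.35}$$

exists a.e. and is also square-integrable. Furthermore,

$$f(x) = \int_{-\infty}^{\infty}\frac{dk}{\sqrt{2\pi}}\,e^{ikx}F(k) \tag{2.36}$$

For a more rigorous derivation see Physics 50 (Mathematical Methods) notes

http://www.johnboccio.com/courses/Physics50_2010/006_FourierTransform.pdf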



Example

Let $f(x)$ be a Gaussian $f(x) = Ne^{-(x-x_0)^2/4\sigma^2}$, where $x_0$, $\sigma$ and $N$ are real and
$\sigma > 0$, $N > 0$. First we determine $N$ so that $f(x)$ is normalized.

$$1 = \int_{-\infty}^{\infty}dx\,|f(x)|^2 = N^2\int_{-\infty}^{\infty}dx\,e^{-(x-x_0)^2/2\sigma^2} = N^2\int_{-\infty}^{\infty}du\,e^{-u^2/2\sigma^2} = N^2\sigma\sqrt{2}\int_{-\infty}^{\infty}dv\,e^{-v^2}$$

$$1 = N^2\sigma\sqrt{2\pi} \;\to\; N = \frac{1}{\sqrt{\sqrt{2\pi}\,\sigma}}$$

so that the normalized $f(x)$ is

$$f(x) = \frac{1}{\sqrt{\sqrt{2\pi}\,\sigma}}\,e^{-(x-x_0)^2/4\sigma^2} \tag{2.37}$$

as shown in Figure 2.2 below.

Figure 2.2: Gaussian Function

Now let us calculate the Fourier transform. We have

$$F(k) = \int_{-\infty}^{\infty}\frac{dx}{\sqrt{2\pi}}\frac{1}{\sqrt{\sqrt{2\pi}\,\sigma}}\,e^{-(x-x_0)^2/4\sigma^2}e^{-ikx}$$

$$= \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{\sqrt{2\pi}\,\sigma}}\int_{-\infty}^{\infty}du\,e^{-u^2/4\sigma^2}e^{-ik(u+x_0)}$$

$$= \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{\sqrt{2\pi}\,\sigma}}\,e^{-ikx_0}\int_{-\infty}^{\infty}du\,e^{-(u+2ik\sigma^2)^2/4\sigma^2}e^{-4\sigma^4k^2/4\sigma^2}$$

$$= \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{\sqrt{2\pi}\,\sigma}}\,e^{-ikx_0}e^{-\sigma^2k^2}\int_{-\infty}^{\infty}du\,e^{-(u+2ik\sigma^2)^2/4\sigma^2}$$

where the last two steps follow from completing the square. With $Z = (u + 2ik\sigma^2)/2\sigma$ we then have

$$F(k) = \frac{2\sigma}{\sqrt{2\pi}\sqrt{\sqrt{2\pi}\,\sigma}}\,e^{-ikx_0}e^{-\sigma^2k^2}\int_{-\infty+i\sigma k}^{\infty+i\sigma k}dZ\,e^{-Z^2} \tag{2.38}$$

This last integral can be done using complex integration methods (again see
Physics 50 (Mathematical Methods) notes), which would show that it does not
matter whether we integrate over the path in the complex plane as indicated
($k \ne 0$) or along the real axis ($k = 0$).


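We will now show this property another way. We define

$$I(k) = \int_{-\infty+i\sigma k}^{\infty+i\sigma k}dZ\,e^{-Z^2} \tag{2.39}$$

We now show that $dI/dk = 0$, so that $I(k)$ is independent of $k$. Let $Z = v + i\sigma k$.
Then

$$I(k) = \int_{-\infty}^{\infty}dv\,e^{-(v+i\sigma k)^2} \tag{2.40}$$

so that

$$\frac{dI}{dk} = \int_{-\infty}^{\infty}dv\,\frac{\partial}{\partial k}\left(e^{-(v+i\sigma k)^2}\right) = i\sigma\int_{-\infty}^{\infty}dv\,\frac{\partial}{\partial v}\left(e^{-(v+i\sigma k)^2}\right) = i\sigma\left[e^{-(v+i\sigma k)^2}\right]_{v=-\infty}^{v=\infty} = 0$$

Therefore, $I(k)$ = constant = $I(0)$ and

$$I(k) = I(0) = \int_{-\infty}^{\infty}dv\,e^{-v^2} = \sqrt{\pi} \tag{2.41}$$

$$F(k) = \sqrt{\frac{2\sigma}{\sqrt{2\pi}}}\,e^{-ikx_0}e^{-\sigma^2k^2} \tag{2.42}$$

Figure 2.3: Gaussian Fourier Transform

Note that

$$\int_{-\infty}^{\infty}dk\,|F(k)|^2 = 1 = \int_{-\infty}^{\infty}dx\,|f(x)|^2$$

This will be seen to be Parseval's relation, which we derive shortly.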
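As a numerical cross-check of the Gaussian example, the Fourier integral (2.35) can be evaluated by a simple Riemann sum and compared with the closed form $F(k) = (2\sigma^2/\pi)^{1/4}e^{-ikx_0}e^{-\sigma^2k^2}$, which is the same prefactor as $\sqrt{2\sigma/\sqrt{2\pi}}$. This is a sketch only, assuming NumPy; the values of `sigma` and `x0` and the grid sizes are arbitrary choices made here.

```python
import numpy as np

sigma, x0 = 0.7, 1.3          # illustrative parameter values

# Position grid wide enough that the Gaussian tails are negligible
x = np.linspace(x0 - 20 * sigma, x0 + 20 * sigma, 2001)
dx = x[1] - x[0]
f = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-(x - x0) ** 2 / (4 * sigma**2))

# Wave-number grid and Riemann-sum approximation of
# F(k) = integral dx / sqrt(2 pi) * f(x) * exp(-i k x)
k = np.linspace(-8 / sigma, 8 / sigma, 401)
dk = k[1] - k[0]
F_num = (np.exp(-1j * np.outer(k, x)) @ f) * dx / np.sqrt(2 * np.pi)

# Closed form obtained by completing the square
F_exact = (2 * sigma**2 / np.pi) ** 0.25 * np.exp(-1j * k * x0) * np.exp(-(sigma * k) ** 2)

max_err = np.max(np.abs(F_num - F_exact))
norm_x = np.sum(np.abs(f) ** 2) * dx          # integral of |f|^2 over x
norm_k = np.sum(np.abs(F_num) ** 2) * dk      # integral of |F|^2 over k
print(max_err, norm_x, norm_k)
```

Both norms come out equal to 1 within rounding, which is the special case $f = g$ of Parseval's relation.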


Now we generalize to 3 dimensions. Let $f(\vec{x}) = f(x,y,z)$ be a square-integrable
function of 3 variables. We can Fourier transform in $z$ first, then in $y$ and finally
in $x$.

$$F(\vec{k}) = F(k_x,k_y,k_z) = \int_{-\infty}^{\infty}\frac{dx}{\sqrt{2\pi}}e^{-ik_xx}\int_{-\infty}^{\infty}\frac{dy}{\sqrt{2\pi}}e^{-ik_yy}\int_{-\infty}^{\infty}\frac{dz}{\sqrt{2\pi}}e^{-ik_zz}f(x,y,z) = \int\frac{d^3x}{(2\pi)^{3/2}}e^{-i\vec{k}\cdot\vec{x}}f(\vec{x}) \tag{2.43}$$

where $d^3x = dx\,dy\,dz$ and $\vec{k}\cdot\vec{x} = k_xx + k_yy + k_zz$. Similarly, we have

$$f(\vec{x}) = f(x,y,z) = \int_{-\infty}^{\infty}\frac{dk_x}{\sqrt{2\pi}}e^{ik_xx}\int_{-\infty}^{\infty}\frac{dk_y}{\sqrt{2\pi}}e^{ik_yy}\int_{-\infty}^{\infty}\frac{dk_z}{\sqrt{2\pi}}e^{ik_zz}F(k_x,k_y,k_z) = \int\frac{d^3k}{(2\pi)^{3/2}}e^{i\vec{k}\cdot\vec{x}}F(\vec{k}) \tag{2.44}$$

which is obtained by Fourier inverting first in $x$, then in $y$ and finally in $z$. It is
the inverse Fourier transform.

Thus, if $f(\vec{x})$ is square-integrable, the Fourier transform

$$F(\vec{k}) = \int\frac{d^3x}{(2\pi)^{3/2}}e^{-i\vec{k}\cdot\vec{x}}f(\vec{x}) \tag{2.45}$$

exists a.e. and is also square-integrable. Furthermore

$$f(\vec{x}) = \int\frac{d^3k}{(2\pi)^{3/2}}e^{i\vec{k}\cdot\vec{x}}F(\vec{k}) \tag{2.46}$$

Parseval's Relation

Let $F(\vec{k})$ and $G(\vec{k})$ be the Fourier transforms of $f(\vec{x})$ and $g(\vec{x})$ respectively.
Then

$$\langle f|g\rangle = \langle F|G\rangle \tag{2.47}$$

that is,

$$\int d^3x\,f^*(\vec{x})g(\vec{x}) = \int d^3k\,F^*(\vec{k})G(\vec{k}) \tag{2.48}$$

Proof

$$\int d^3x\,f^*(\vec{x})g(\vec{x}) = \int d^3x\left[\int\frac{d^3k}{(2\pi)^{3/2}}e^{-i\vec{k}\cdot\vec{x}}F^*(\vec{k})\right]g(\vec{x}) = \int d^3k\,F^*(\vec{k})\int\frac{d^3x}{(2\pi)^{3/2}}e^{-i\vec{k}\cdot\vec{x}}g(\vec{x}) = \int d^3k\,F^*(\vec{k})G(\vec{k}) \tag{2.49}$$

Note: For $f = g$ we obtain $\langle f|f\rangle = \langle F|F\rangle$, that is,

$$\int d^3x\,|f(\vec{x})|^2 = \int d^3k\,|F(\vec{k})|^2 \tag{2.50}$$

which will be very useful.

The Dirac Delta Function (a generalized function)

We define the function $\delta_\varepsilon(x)$, as shown in Figure 2.4 below,

Figure 2.4: Delta Function Definition

by the relation

$$\delta_\varepsilon(x) = \begin{cases}0 & \text{for } |x| > \varepsilon/2\\ 1/\varepsilon & \text{for } |x| < \varepsilon/2\end{cases} \tag{2.51}$$

Taking the limit $\varepsilon \to 0^+$ we get

$$\lim_{\varepsilon\to0^+}\delta_\varepsilon(x) = \begin{cases}0 & \text{for } x \ne 0\\ \infty & \text{for } x = 0\end{cases} \tag{2.52}$$

such that the area under the curve remains equal to 1.

The problem is that such a limit does not exist as an ordinary function. Instead,
consider

$$\lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\delta_\varepsilon(x - x_0) \tag{2.53}$$

where the function $\delta_\varepsilon(x - x_0)$ is as shown in Figure 2.5 below:

Figure 2.5: Another Delta Function Definition

and $f(x)$ is some arbitrary function continuous at $x = x_0$. One takes the limit
after the integral has been done. We then have the result (which is the correct
defining relation)

$$\lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\delta_\varepsilon(x - x_0) = \lim_{\varepsilon\to0^+}\frac{1}{\varepsilon}\int_{x_0-\varepsilon/2}^{x_0+\varepsilon/2}dx\,f(x) = \lim_{\varepsilon\to0^+}\frac{1}{\varepsilon}f(x_0)\int_{x_0-\varepsilon/2}^{x_0+\varepsilon/2}dx = \lim_{\varepsilon\to0^+}\frac{1}{\varepsilon}f(x_0)\varepsilon = f(x_0) \tag{2.54}$$

where we have used the continuity of $f(x)$ at $x = x_0$ to take the term $f(x_0)$
outside the integral.


Definition

Let $\delta_\varepsilon(x - x_0)$ be any function of the type described above and depending on
some parameter $\varepsilon$. Then, $\lim_{\varepsilon\to0^+}\delta_\varepsilon(x - x_0) = \delta(x - x_0)$, which is the Dirac delta
function, means, for any function $f(x)$ continuous at $x_0$,

$$\lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\delta_\varepsilon(x - x_0) = f(x_0) = \int_{-\infty}^{\infty}dx\,f(x)\delta(x - x_0) \tag{2.55}$$

where the last form is just a notation because the limit cannot be brought inside
the integral to give

$$\int_{-\infty}^{\infty}dx\,f(x)\left[\lim_{\varepsilon\to0^+}\delta_\varepsilon(x - x_0)\right] \tag{2.56}$$

since

$$\lim_{\varepsilon\to0^+}\delta_\varepsilon(x - x_0) \tag{2.57}$$

does not exist.
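The defining relation (2.54) is easy to watch numerically: for the box function of width $\varepsilon$, the integral is just the average of $f$ over $[x_0 - \varepsilon/2, x_0 + \varepsilon/2]$. The following is a sketch under that reading, assuming NumPy; the function name `box_average`, the test function $\cos x$ and the point `x0` are illustrative choices.

```python
import numpy as np

def box_average(f, x0, eps, n=20001):
    """Midpoint-rule value of (1/eps) * integral of f over [x0 - eps/2, x0 + eps/2].

    This equals the integral of f(x) * delta_eps(x - x0) for the box-shaped delta_eps.
    """
    dx = eps / n
    x = x0 - eps / 2 + dx * (np.arange(n) + 0.5)
    return np.sum(f(x)) * dx / eps

x0 = 0.3
# Deviation from f(x0) for a shrinking sequence of widths eps
errors = [abs(box_average(np.cos, x0, eps) - np.cos(x0)) for eps in (1.0, 0.1, 0.01)]
print(errors)
```

The deviation from $f(x_0)$ falls off roughly like $\varepsilon^2$ for this smooth test function, so the limit is taken only after the integral has been done, as the definition requires.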


Note: Compare the delta function with the Kronecker delta:

$$\sum_jc_j\delta_{ij} = c_i \quad\text{picks out } j = i$$

$$\int_{-\infty}^{\infty}dx\,f(x)\delta(x - x_0) = f(x_0) \quad\text{picks out } x = x_0$$

There are many different functions $\delta_\varepsilon(x - x_0)$ such that $\lim_{\varepsilon\to0^+}\delta_\varepsilon(x - x_0) = \delta(x - x_0)$
in the above sense.

Properties of the Delta Function

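1. In a non-rigorous way,

$$\lim_{\varepsilon\to0^+}\delta_\varepsilon(x) = \begin{cases}0 & \text{for } x \ne 0\\ \infty & \text{for } x = 0\end{cases} \tag{2.58}$$

2. For $f(x)$ continuous at $x = 0$

$$\int_{-\infty}^{\infty}dx\,f(x)\delta(x) = f(0) \tag{2.59}$$

3. In particular

$$\int_{-\infty}^{\infty}dx\,\delta(x - x_0) = 1 \tag{2.60}$$

4. Let

$$g(x) = \begin{cases}f(x) & \text{for } x \in (a,b)\\ 0 & \text{for } x \notin [a,b]\end{cases}$$

then

$$\int_a^bdx\,f(x)\delta(x - x_0) = \int_{-\infty}^{\infty}dx\,g(x)\delta(x - x_0) = g(x_0) = \begin{cases}f(x_0) & \text{for } x_0 \in (a,b)\\ 0 & \text{for } x_0 \notin [a,b]\end{cases} \tag{2.61}$$

5. For any real constant $a \ne 0$

$$\delta(ax) = \frac{1}{|a|}\delta(x) \tag{2.62}$$

Proof

For $a > 0$

$$\int_{-\infty}^{\infty}dx\,f(x)\delta(ax) = \frac{1}{|a|}\int_{-\infty}^{\infty}du\,f(u/a)\delta(u) = \frac{1}{|a|}f(0)$$

For $a < 0$

$$\int_{-\infty}^{\infty}dx\,f(x)\delta(ax) = \frac{1}{a}\int_{\infty}^{-\infty}du\,f(u/a)\delta(u) = \frac{1}{|a|}\int_{-\infty}^{\infty}du\,f(u/a)\delta(u) = \frac{1}{|a|}f(0)$$

But,

$$\int_{-\infty}^{\infty}dx\,f(x)\frac{\delta(x)}{|a|} = \frac{1}{|a|}f(0)$$

Therefore,

$$\delta(ax) = \frac{1}{|a|}\delta(x)$$

6. It follows from (5) that

$$\delta(-x) = \delta(x) \tag{2.63}$$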
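The scaling property (5) can be tested with any admissible $\delta_\varepsilon$. The sketch below assumes NumPy and uses a narrow unit-area Gaussian as $\delta_\varepsilon$, which is one possible choice and not the box function used earlier; the values of `a` and `eps` and the test function $e^x$ are also arbitrary.

```python
import numpy as np

def delta_eps(x, eps):
    """A unit-area Gaussian of width ~ sqrt(eps); tends to delta(x) as eps -> 0+."""
    return np.exp(-x**2 / (4 * eps)) / np.sqrt(4 * np.pi * eps)

a, eps = -2.5, 1e-4           # a negative scale factor exercises the |a|
x = np.linspace(-1.0, 1.0, 200001)
dx = x[1] - x[0]
f = np.exp(x)                 # test function with f(0) = 1

lhs = np.sum(f * delta_eps(a * x, eps)) * dx   # integral of f(x) * delta_eps(a x)
rhs = np.exp(0.0) / abs(a)                     # f(0) / |a|
print(lhs, rhs)
```

Replacing `a` by `-a` leaves `lhs` unchanged, which is the numerical counterpart of property (6).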


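Fourier Transform of a $\delta$-function

Heuristic: Fourier transform (this is well-defined)

$$\Delta(k) = \int_{-\infty}^{\infty}\frac{dx}{\sqrt{2\pi}}e^{-ikx}\delta(x) = \frac{1}{\sqrt{2\pi}} \tag{2.64}$$

Therefore,

$$\delta(x) = \int_{-\infty}^{\infty}\frac{dk}{\sqrt{2\pi}}e^{ikx}\Delta(k) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}e^{ikx} \tag{2.65}$$

where this integral is really not defined!

Rigorous: Let

$$\delta_\varepsilon(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}e^{\pm ikx}e^{-\varepsilon k^2} \tag{2.66}$$

where $e^{-\varepsilon k^2}$ is a convergence factor which gives a well-defined integral for $\varepsilon > 0$.
We ask the question: does $\lim_{\varepsilon\to0^+}\delta_\varepsilon(x) = \delta(x)$ in the same sense as the earlier
definition?

$$\lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\delta_\varepsilon(x) = \lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\int_{-\infty}^{\infty}\frac{dk}{2\pi}e^{\pm ikx}e^{-\varepsilon k^2}$$

We can now take the limit inside the integral if and only if the resulting integral
is well-defined. We have

$$\lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\delta_\varepsilon(x) = \lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}\frac{dk}{\sqrt{2\pi}}e^{-\varepsilon k^2}F(\mp k) = \int_{-\infty}^{\infty}\frac{dk}{\sqrt{2\pi}}F(\mp k) = f(0)$$

Therefore,

$$\delta(x) = \lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}\frac{dk}{2\pi}e^{\pm ikx}e^{-\varepsilon k^2} \tag{2.67}$$

which we will write as

$$\delta(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}e^{\pm ikx} \tag{2.68}$$

with the above limit in mind.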
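For a fixed $\varepsilon > 0$ the regulated integral (2.66) is an ordinary Gaussian integral, with the closed form $\delta_\varepsilon(x) = e^{-x^2/4\varepsilon}/\sqrt{4\pi\varepsilon}$ (a standard result quoted here, not derived in the text). The sketch below, assuming NumPy, compares a direct $k$-sum with that closed form and then smears $\cos x$ with $\delta_\varepsilon$; function names and grid sizes are choices made for this illustration.

```python
import numpy as np

def delta_eps_closed(x, eps):
    """Closed form of the regulated integral: exp(-x^2 / 4 eps) / sqrt(4 pi eps)."""
    return np.exp(-x**2 / (4 * eps)) / np.sqrt(4 * np.pi * eps)

def delta_eps_numeric(x, eps, n_k=4001):
    """Riemann-sum value of integral dk / (2 pi) * exp(i k x) * exp(-eps k^2)."""
    k_max = 8 / np.sqrt(eps)              # beyond this the factor exp(-eps k^2) is negligible
    k = np.linspace(-k_max, k_max, n_k)
    dk = k[1] - k[0]
    return np.array([np.sum(np.exp(1j * k * xi) * np.exp(-eps * k**2)).real * dk / (2 * np.pi)
                     for xi in x])

eps = 0.01
x_test = np.linspace(-0.5, 0.5, 11)
max_diff = np.max(np.abs(delta_eps_numeric(x_test, eps) - delta_eps_closed(x_test, eps)))

def smeared_at_zero(f, eps, n=4001):
    """Integral of f(x) * delta_eps(x) over a window covering the whole peak."""
    half = 20 * np.sqrt(2 * eps)
    x = np.linspace(-half, half, n)
    dx = x[1] - x[0]
    return np.sum(f(x) * delta_eps_closed(x, eps)) * dx

values = [smeared_at_zero(np.cos, e) for e in (1e-1, 1e-2, 1e-3)]
print(max_diff, values)
```

The smeared values approach $f(0) = 1$ as $\varepsilon \to 0^+$. The $k$-sum is symmetric in $k$, so the sign choice in $e^{\pm ikx}$ makes no difference.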


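Derivatives of a $\delta$-function

Let $\lim_{\varepsilon\to0^+}\delta_\varepsilon(x) = \delta(x)$. Then

$$\lim_{\varepsilon\to0^+}\frac{d\delta_\varepsilon(x - x_0)}{dx} = \frac{d\delta(x - x_0)}{dx} \tag{2.69}$$

which means

$$\int_{-\infty}^{\infty}dx\,f(x)\frac{d\delta(x - x_0)}{dx} = \lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,f(x)\frac{d\delta_\varepsilon(x - x_0)}{dx}$$

$$= \lim_{\varepsilon\to0^+}\left[f(x)\delta_\varepsilon(x - x_0)\right]_{-\infty}^{\infty} - \lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,\frac{df(x)}{dx}\delta_\varepsilon(x - x_0)$$

$$= -\lim_{\varepsilon\to0^+}\int_{-\infty}^{\infty}dx\,\frac{df(x)}{dx}\delta_\varepsilon(x - x_0) = -\frac{df(x_0)}{dx}$$

where we have used integration by parts and $\delta_\varepsilon(\pm\infty) = 0$. Therefore,

$$\int_{-\infty}^{\infty}dx\,f(x)\frac{d\delta(x - x_0)}{dx} = -\frac{df(x_0)}{dx} \tag{2.70}$$

when $df(x)/dx$ is continuous at $x = x_0$. Similar results hold for higher derivatives
of $\delta(x - x_0)$ (just keep on integrating by parts).

Three-Dimensional Delta Function

We will write all integrals with the limiting process understood where appropriate.

$$\int d^3x\,f(\vec{x})\delta^3(\vec{x} - \vec{x}_0) = f(\vec{x}_0) = \lim_{\varepsilon\to0^+}\int d^3x\,f(\vec{x})\delta^3_\varepsilon(\vec{x} - \vec{x}_0) \tag{2.71}$$

for all functions $f(\vec{x})$ continuous at $\vec{x} = \vec{x}_0$. But

$$f(\vec{x}_0) = \int dx\,dy\,dz\,f(x,y,z)\delta(x - x_0)\delta(y - y_0)\delta(z - z_0) \tag{2.72}$$

or

$$\delta^3(\vec{x} - \vec{x}_0) = \delta(x - x_0)\delta(y - y_0)\delta(z - z_0) = \int\frac{d^3k}{(2\pi)^3}e^{\pm i\vec{k}\cdot(\vec{x} - \vec{x}_0)} \tag{2.73}$$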


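2.1.4. Postulate 2

Motivation

$\psi(\vec{x},t)$ is square-integrable, which implies that we can Fourier analyze it in $\vec{x}$ ($t$
fixed)

$$\psi(\vec{x},t) = \int\frac{d^3k}{(2\pi)^{3/2}}e^{+i\vec{k}\cdot\vec{x}}\phi(\vec{k},t) \tag{2.74}$$

This is just a superposition of sines and cosines with wavelengths $\lambda = 2\pi/k$. But
$\lambda = h/p$ (deBroglie) relates this wave property ($\lambda$) to the particle property ($p$)
by

$$k = \frac{2\pi}{\lambda} = \frac{2\pi}{h}p = \frac{p}{\hbar} \tag{2.75}$$

where we have defined $\hbar = h/2\pi$. Then we have $p = \hbar k$. Since the spatial
variation of $e^{+i\vec{k}\cdot\vec{x}}$ occurs in the $\vec{k}$ direction, we expect $\vec{p}$ to be along $\vec{k}$'s direction,
that is, $\vec{p} = \hbar\vec{k}$, that is, $\psi(\vec{x},t)$ is a weighted (weighted by something related to
probabilities) superposition of states of different momentum. Now, Parseval's
relation gives

$$\langle\psi|\psi\rangle = \langle\phi|\phi\rangle = \int d^3k\,|\phi(\vec{k},t)|^2 \tag{2.76}$$

Therefore,

$$\int d^3k\,\frac{|\phi(\vec{k},t)|^2}{\langle\psi|\psi\rangle} = 1 = \int d^3p\,\frac{|\phi(\vec{k},t)|^2}{\hbar^3\langle\psi|\psi\rangle} \tag{2.77}$$

But $\wp(\vec{p},t)$ must satisfy

$$\int d^3p\,\wp(\vec{p},t) = 1 \tag{2.78}$$

We are therefore led to the conjecture that

$$\wp(\vec{p},t) = \frac{1}{\hbar^3}\frac{|\phi(\vec{k},t)|^2}{\langle\psi|\psi\rangle} \tag{2.79}$$

Let

$$\tilde{\psi}(\vec{p},t) = \frac{1}{\hbar^{3/2}}\phi(\vec{k},t) \tag{2.80}$$

with $\vec{p} = \hbar\vec{k}$. Therefore

$$\wp(\vec{p},t) = \frac{|\tilde{\psi}(\vec{p},t)|^2}{\langle\psi|\psi\rangle} \tag{2.81}$$

where

$$\tilde{\psi}(\vec{p},t) = \int\frac{d^3x}{(2\pi\hbar)^{3/2}}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\psi(\vec{x},t) \tag{2.82}$$

which is the Fourier transform of $\psi(\vec{x},t)$ in momentum space. We also have

$$\psi(\vec{x},t) = \int\frac{d^3p}{(2\pi\hbar)^{3/2}}e^{+\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\tilde{\psi}(\vec{p},t) \tag{2.83}$$

Note that Parseval's relation becomes:

$$\langle\psi_1|\psi_2\rangle = (\psi_1,\psi_2) = (\phi_1,\phi_2) = \langle\phi_1|\phi_2\rangle = \int d^3k\,\phi_1^*(\vec{k},t)\phi_2(\vec{k},t) = \int d^3p\,\tilde{\psi}_1^*(\vec{p},t)\tilde{\psi}_2(\vec{p},t) = \langle\tilde{\psi}_1|\tilde{\psi}_2\rangle = (\tilde{\psi}_1,\tilde{\psi}_2)$$

Therefore,

$$\langle\psi_1|\psi_2\rangle = \langle\tilde{\psi}_1|\tilde{\psi}_2\rangle \;\Rightarrow\; \langle\psi|\psi\rangle = \langle\tilde{\psi}|\tilde{\psi}\rangle \tag{2.84}$$

Postulate 2

Let $\psi(\vec{x},t)$ be the wave function for a particle under the influence of given forces.
Then,

$$\wp(\vec{p},t) = \frac{|\tilde{\psi}(\vec{p},t)|^2}{\langle\tilde{\psi}|\tilde{\psi}\rangle} \tag{2.85}$$

where $\psi(\vec{x},t)$ and $\tilde{\psi}(\vec{p},t)$ are related as in (2.83).

Recall that postulate 1 says that

$$\wp(\vec{x},t) = \frac{|\psi(\vec{x},t)|^2}{\langle\psi|\psi\rangle} \tag{2.86}$$

Thus, the position distribution and the momentum distribution are related!
The relationship is somewhat complicated because $\psi(\vec{x},t)$ and $\tilde{\psi}(\vec{p},t)$ are related
by Fourier transformation in momentum space.

We recall that the average value of $f(\vec{x})$ is given by

$$\langle f(\vec{x})\rangle = \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)f(\vec{x})\psi(\vec{x},t) = \frac{1}{\langle\psi|\psi\rangle}\langle\psi|f(\vec{x})|\psi\rangle \tag{2.87}$$

Now consider the function $g(\vec{p})$. We have for the average value of $g(\vec{p})$

$$\langle g(\vec{p})\rangle = \int d^3p\,\wp(\vec{p},t)g(\vec{p}) = \int d^3p\,\frac{|\tilde{\psi}(\vec{p},t)|^2}{\langle\tilde{\psi}|\tilde{\psi}\rangle}g(\vec{p}) = \frac{1}{\langle\tilde{\psi}|\tilde{\psi}\rangle}\int d^3p\,\tilde{\psi}^*(\vec{p},t)g(\vec{p})\tilde{\psi}(\vec{p},t) = \frac{1}{\langle\tilde{\psi}|\tilde{\psi}\rangle}\langle\tilde{\psi}|g(\vec{p})|\tilde{\psi}\rangle$$

It would be somewhat simpler if we could calculate $\langle g(\vec{p})\rangle$ in terms of $\psi(\vec{x},t)$,
so that we would not have to Fourier transform $\psi(\vec{x},t)$ explicitly.

Consider $\langle p_i\rangle$ where

$$p_1 = p_x \;,\; p_2 = p_y \;,\; p_3 = p_z \;,\; x_1 = x \;,\; x_2 = y \;,\; x_3 = z$$

We have

$$\langle p_i\rangle = \frac{1}{\langle\tilde{\psi}|\tilde{\psi}\rangle}\int d^3p\,\tilde{\psi}^*(\vec{p},t)\,p_i\,\tilde{\psi}(\vec{p},t) = \frac{1}{\langle\tilde{\psi}|\tilde{\psi}\rangle}\int d^3p\,\tilde{\psi}^*(\vec{p},t)\left[p_i\tilde{\psi}(\vec{p},t)\right] \tag{2.88}$$

Using Parseval's relation we have

$$\langle p_i\rangle = \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)\left(\text{InverseFourierTransform}\left[p_i\tilde{\psi}(\vec{p},t)\right]\right) \tag{2.89}$$

Now

$$\left[p_i\tilde{\psi}(\vec{p},t)\right] = p_i\int\frac{d^3x}{(2\pi\hbar)^{3/2}}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\psi(\vec{x},t) = \int\frac{d^3x}{(2\pi\hbar)^{3/2}}\left(-\frac{\hbar}{i}\frac{\partial}{\partial x_i}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\right)\psi(\vec{x},t)$$

$$= (\text{surface term at } \infty) + \frac{\hbar}{i}\int\frac{d^3x}{(2\pi\hbar)^{3/2}}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\frac{\partial\psi(\vec{x},t)}{\partial x_i} = \frac{\hbar}{i}\int\frac{d^3x}{(2\pi\hbar)^{3/2}}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\frac{\partial\psi(\vec{x},t)}{\partial x_i} \tag{2.90}$$

where we have integrated by parts in $dx_i$ and used the fact that $\psi(\vec{x},t) = 0$ at
$r = \infty$ so that all surface terms vanish. Thus $p_i\tilde{\psi}(\vec{p},t)$ is the Fourier transform of
$(\hbar/i)\,\partial\psi/\partial x_i$. Therefore,

$$\langle p_i\rangle = \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)\frac{\hbar}{i}\frac{\partial}{\partial x_i}\psi(\vec{x},t) \tag{2.91}$$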

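Consider now

$$\langle p_x^Lp_y^Mp_z^N\rangle = \frac{1}{\langle\tilde{\psi}|\tilde{\psi}\rangle}\int d^3p\,\tilde{\psi}^*(\vec{p},t)\,p_x^Lp_y^Mp_z^N\,\tilde{\psi}(\vec{p},t) = \frac{1}{\langle\tilde{\psi}|\tilde{\psi}\rangle}\int d^3p\,\tilde{\psi}^*(\vec{p},t)\left[p_x^Lp_y^Mp_z^N\tilde{\psi}(\vec{p},t)\right]$$

$$= \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)\left(\text{InverseFourierTransform}\left[p_x^Lp_y^Mp_z^N\tilde{\psi}(\vec{p},t)\right]\right)$$

As before

$$p_x^Lp_y^Mp_z^N\tilde{\psi}(\vec{p},t) = \int\frac{d^3x}{(2\pi\hbar)^{3/2}}\left[\left(-\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^L\left(-\frac{\hbar}{i}\frac{\partial}{\partial y}\right)^M\left(-\frac{\hbar}{i}\frac{\partial}{\partial z}\right)^Ne^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\right]\psi(\vec{x},t)$$

Integrating by parts $L + M + N$ times where all surface terms vanish, we get

$$p_x^Lp_y^Mp_z^N\tilde{\psi}(\vec{p},t) = \int\frac{d^3x}{(2\pi\hbar)^{3/2}}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\left[\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^L\left(\frac{\hbar}{i}\frac{\partial}{\partial y}\right)^M\left(\frac{\hbar}{i}\frac{\partial}{\partial z}\right)^N\psi(\vec{x},t)\right]$$

Therefore InverseFourierTransform$\left[p_x^Lp_y^Mp_z^N\tilde{\psi}(\vec{p},t)\right]$ is

$$\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^L\left(\frac{\hbar}{i}\frac{\partial}{\partial y}\right)^M\left(\frac{\hbar}{i}\frac{\partial}{\partial z}\right)^N\psi(\vec{x},t)$$

and we obtain

$$\langle p_x^Lp_y^Mp_z^N\rangle = \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^L\left(\frac{\hbar}{i}\frac{\partial}{\partial y}\right)^M\left(\frac{\hbar}{i}\frac{\partial}{\partial z}\right)^N\psi(\vec{x},t)$$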
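Thus, if $g(\vec{p})$ has a Taylor expansion about $\vec{p} = 0$, $g(\vec{p})$ is a sum of terms of the
form $p_x^Lp_y^Mp_z^N$. Since the average of a sum is the sum of the averages, we have

$$\langle g(\vec{p})\rangle = \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)\,g\!\left(\frac{\hbar}{i}\frac{\partial}{\partial x},\frac{\hbar}{i}\frac{\partial}{\partial y},\frac{\hbar}{i}\frac{\partial}{\partial z}\right)\psi(\vec{x},t) \tag{2.92}$$

where

$$g\!\left(\frac{\hbar}{i}\frac{\partial}{\partial x},\frac{\hbar}{i}\frac{\partial}{\partial y},\frac{\hbar}{i}\frac{\partial}{\partial z}\right) \tag{2.93}$$

acts on $\psi(\vec{x},t)$. If we use

$$\nabla = \left(\frac{\partial}{\partial x},\frac{\partial}{\partial y},\frac{\partial}{\partial z}\right) \tag{2.94}$$

we have

$$\langle g(\vec{p})\rangle = \frac{1}{\langle\psi|\psi\rangle}\int d^3x\,\psi^*(\vec{x},t)\,g\!\left(\frac{\hbar}{i}\nabla\right)\psi(\vec{x},t) = \frac{1}{\langle\psi|\psi\rangle}\left\langle\psi\left|\,g\!\left(\frac{\hbar}{i}\nabla\right)\right|\psi\right\rangle \tag{2.95}$$

We now define the momentum operator

$$\vec{p}_{op} = \frac{\hbar}{i}\nabla = -i\hbar\nabla \tag{2.96}$$

that is,

$$p_{j,op} = \frac{\hbar}{i}\frac{\partial}{\partial x_j} \tag{2.97}$$

Then

$$\langle g(\vec{p})\rangle = \frac{1}{\langle\psi|\psi\rangle}\langle\psi|g(\vec{p}_{op})|\psi\rangle \tag{2.98}$$

and

$$\langle f(\vec{x})\rangle = \frac{1}{\langle\psi|\psi\rangle}\langle\psi|f(\vec{x})|\psi\rangle \tag{2.99}$$

and we are able to do everything using $\psi(\vec{x},t)$.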
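The equivalence of the two routes to $\langle p\rangle$ can be illustrated in one dimension. The sketch below assumes NumPy and units with $\hbar = 1$; the Gaussian packet with carrier momentum `p0`, the grids, and the use of central differences for $\partial/\partial x$ are all choices made for this illustration. It computes $\langle p\rangle$ once from $\psi(x)$ using $(\hbar/i)\,\partial/\partial x$ and once from $|\tilde{\psi}(p)|^2$.

```python
import numpy as np

hbar = 1.0
sigma, x0, p0 = 1.0, 0.5, 2.0      # illustrative packet parameters

x = np.linspace(x0 - 12 * sigma, x0 + 12 * sigma, 2401)
dx = x[1] - x[0]
psi = ((2 * np.pi * sigma**2) ** -0.25
       * np.exp(-(x - x0) ** 2 / (4 * sigma**2))
       * np.exp(1j * p0 * x / hbar))

# Route 1: position space, <p> = (1/<psi|psi>) integral psi^* (hbar/i) d(psi)/dx
dpsi = np.zeros_like(psi)
dpsi[1:-1] = (psi[2:] - psi[:-2]) / (2 * dx)     # central differences
norm_x = np.sum(np.abs(psi) ** 2) * dx
p_mean_x = (np.sum(np.conj(psi) * (hbar / 1j) * dpsi) * dx / norm_x).real

# Route 2: momentum space, psi_tilde(p) = integral dx / sqrt(2 pi hbar) e^{-ipx/hbar} psi(x)
p = np.linspace(p0 - 6 / sigma, p0 + 6 / sigma, 401)
dp = p[1] - p[0]
psi_t = (np.exp(-1j * np.outer(p, x) / hbar) @ psi) * dx / np.sqrt(2 * np.pi * hbar)
norm_p = np.sum(np.abs(psi_t) ** 2) * dp
p_mean_p = np.sum(p * np.abs(psi_t) ** 2) * dp / norm_p

print(p_mean_x, p_mean_p, norm_x, norm_p)
```

Both routes return a value close to `p0`, and the two norms agree as Parseval's relation (2.84) requires; the small difference in the position-space result comes from the finite-difference derivative.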
2.1.5. Operator Formalism

An operator maps functions into functions. Consider an operator $\hat{A}$ (the hat signifies
an operator). We have

$$\hat{A}\phi_1(\vec{x}) = \phi_2(\vec{x}) \;\Rightarrow\; \hat{A}\phi_1 = \phi_2 \tag{2.103}$$

where the last form is standard and the variable (position) dependence is understood.

In words, the operator $\hat{A}$ acts on the function $\phi_1(\vec{x})$ to produce the function
$\phi_2(\vec{x})$.

Examples

1. $\hat{A}\phi = x_j\phi$ $\Rightarrow$ multiplication by $x_j$

2. $\hat{A}\phi = \frac{\hbar}{i}\frac{\partial\phi}{\partial x_j}$ $\Rightarrow$ momentum operator

3. $\hat{A}\phi = x_j\phi + \frac{\hbar}{i}\frac{\partial\phi}{\partial x_j}$ or $\hat{A} = x_j + \frac{\hbar}{i}\frac{\partial}{\partial x_j}$

4. $\hat{A}\phi(\vec{x}) = \int d^3y\,K(\vec{x},\vec{y})\phi(\vec{y})$ $\Rightarrow$ an integral operator
5 - ^ = 

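Definition: An operator is linear if $\hat{A}(\lambda_1\phi_1 + \lambda_2\phi_2) = \lambda_1\hat{A}\phi_1 + \lambda_2\hat{A}\phi_2$ for all
complex constants $\lambda_1$ and $\lambda_2$ and for all $\phi_1$ and $\phi_2$.

In the examples above (1), (2), (3) and (4) are linear operators and (5) is not
linear. From now on, all our operators will be linear.

The momentum operator and the kinetic energy operator are both linear!

Example

Consider

$$\hat{A} = \frac{\partial}{\partial x} \;,\quad \hat{B} = x$$

First

$$\hat{A}\hat{B} \ne \frac{\partial}{\partial x}x = 1 \tag{2.104}$$

To calculate $\hat{A}\hat{B}$ correctly, we must compute $(\hat{A}\hat{B})\phi = \hat{A}(\hat{B}\phi)$ with an arbitrary
$\phi$. We have

$$(\hat{A}\hat{B})\phi = \hat{A}(\hat{B}\phi) = \frac{\partial}{\partial x}(x\phi) = \phi + x\frac{\partial\phi}{\partial x} = \left(1 + x\frac{\partial}{\partial x}\right)\phi \tag{2.105}$$

so that

$$\hat{A}\hat{B} = 1 + x\frac{\partial}{\partial x} \tag{2.106}$$

Definition: The commutator of 2 linear operators $\hat{A}$ and $\hat{B}$ is defined to be
$[\hat{A},\hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}$ (unlike numbers, operators do not necessarily commute).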

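Example

$$[x_i,p_{j,op}]\phi = x_i\frac{\hbar}{i}\frac{\partial\phi}{\partial x_j} - \frac{\hbar}{i}\frac{\partial(x_i\phi)}{\partial x_j} = \frac{\hbar}{i}x_i\frac{\partial\phi}{\partial x_j} - \frac{\hbar}{i}x_i\frac{\partial\phi}{\partial x_j} - \frac{\hbar}{i}\frac{\partial x_i}{\partial x_j}\phi = -\frac{\hbar}{i}\frac{\partial x_i}{\partial x_j}\phi = i\hbar\delta_{ij}\phi$$

so that

$$[x_i,p_{j,op}] = i\hbar\delta_{ij} \tag{2.107}$$

Using similar algebra, we find

$$[x_i,x_j] = 0 \;,\quad [p_{i,op},p_{j,op}] = 0 \quad\text{using}\quad \frac{\partial^2}{\partial x_i\partial x_j} = \frac{\partial^2}{\partial x_j\partial x_i} \tag{2.108}$$

These commutators will be of fundamental importance in our development of
quantum theory.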
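The rule "always let the operators act on an arbitrary $\phi$" translates directly into a numerical check of $[x,p_{op}]\phi = i\hbar\phi$ in one dimension. This is a rough sketch, assuming NumPy, $\hbar = 1$, and a central-difference stand-in for $\partial/\partial x$; the function `p_op` and the test function are made up for the illustration.

```python
import numpy as np

hbar = 1.0
x = np.linspace(-6.0, 6.0, 1201)
dx = x[1] - x[0]

def p_op(phi):
    """Apply (hbar/i) d/dx using central differences (boundary values left at zero)."""
    d = np.zeros_like(phi, dtype=complex)
    d[1:-1] = (phi[2:] - phi[:-2]) / (2 * dx)
    return (hbar / 1j) * d

# An arbitrary smooth, localized, complex test function
phi = np.exp(-x**2) * (1 + 0.5j * x)

# [x, p] phi = x (p phi) - p (x phi): the operators act on phi, never on each other alone
commutator_phi = x * p_op(phi) - p_op(x * phi)

# Compare with i*hbar*phi away from the two boundary points
residual = np.max(np.abs(commutator_phi[1:-1] - 1j * hbar * phi[1:-1]))
print(residual)
```

The residual is of order `dx**2` and comes only from the finite-difference derivative; the product-rule term $-(\hbar/i)\phi$ is what survives, exactly as in the algebra leading to (2.107).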


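Now, recall the definition of the inner product

$$\langle\phi_1|\phi_2\rangle = \int d^3x\,\phi_1^*(\vec{x})\phi_2(\vec{x}) \tag{2.109}$$

Thus, writing $\psi = \hat{A}\phi_2$ and $\xi = \hat{A}\phi_1$,

$$\langle\phi_1|\hat{A}\phi_2\rangle = \langle\phi_1|\psi\rangle = \int d^3x\,\phi_1^*(\vec{x})\psi(\vec{x}) = \int d^3x\,\phi_1^*(\vec{x})\hat{A}\phi_2(\vec{x}) \tag{2.110}$$

$$\langle\hat{A}\phi_1|\phi_2\rangle = \langle\xi|\phi_2\rangle = \int d^3x\,\xi^*(\vec{x})\phi_2(\vec{x}) = \int d^3x\,[\hat{A}\phi_1(\vec{x})]^*\phi_2(\vec{x}) \tag{2.111}$$

where $[\hat{A}\phi_1]^*$ means that the operator $\hat{A}$ acts on $\phi_1$ to produce $\xi$ and then it is
complex-conjugated.

Definition: A linear operator is hermitian if

$$\langle\phi|\hat{A}\phi\rangle = \langle\hat{A}\phi|\phi\rangle \tag{2.112}$$

for all $\phi(\vec{x})$ on which $\hat{A}$ is defined. Note that

$$\langle\phi|\hat{A}\phi\rangle = \langle\hat{A}\phi|\phi\rangle^* \tag{2.113}$$

implies that an operator $\hat{A}$ is hermitian if $\langle\phi|\hat{A}\phi\rangle$ is real for all $\phi(\vec{x})$. $x_j$ and
$p_{j,op} = -i\hbar\,\partial/\partial x_j$ are hermitian. If $\hat{A}$ and $\hat{B}$ are hermitian, then $\hat{A}\hat{B}$ is hermitian only
if $[\hat{A},\hat{B}] = 0$.

2.1.6. Heisenberg's Uncertainty Principle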

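We have seen that the probability distribution in $\vec{x}$ (position) and the probability
distribution in $\vec{p}$ (momentum) are related. The relationship involves the Fourier
transform. For $\vec{p} = \hbar\vec{k}$ we found earlier that

$$\psi(\vec{x},t) = \int\frac{d^3p}{(2\pi\hbar)^{3/2}}e^{\frac{i}{\hbar}\vec{p}\cdot\vec{x}}\tilde{\psi}(\vec{p},t) = \int\frac{d^3k}{(2\pi)^{3/2}}e^{i\vec{k}\cdot\vec{x}}\phi(\vec{k},t)$$

$$\wp(\vec{x},t) = \frac{|\psi(\vec{x},t)|^2}{\langle\psi|\psi\rangle} \;\leftrightarrow\; \wp(\vec{p},t) = \frac{|\tilde{\psi}(\vec{p},t)|^2}{\langle\tilde{\psi}|\tilde{\psi}\rangle} = \frac{1}{\hbar^3}\frac{|\phi(\vec{k},t)|^2}{\langle\psi|\psi\rangle}$$

In wave mechanics, Heisenberg's Uncertainty Principle relates the rms deviation
of the position distribution to the rms deviation of the momentum distribution.
Such a relationship follows from the above Fourier transform relationship.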


Heuristic Argument: We have

$$\psi(\vec{x},t) = \int\frac{d^3k}{(2\pi)^{3/2}}e^{+i\vec{k}\cdot\vec{x}}\phi(\vec{k},t) \tag{2.114}$$

Let $y = z = 0$ and consider the dependence on the $x$-coordinate. Let $\phi(\vec{k},t)$ be
real for simplicity and consider Real($\psi(\vec{x},t)$). Therefore,

$$\text{Re}(\psi(\vec{x},t)) = \int\frac{d^3k}{(2\pi)^{3/2}}\cos(k_xx)\phi(\vec{k},t) \tag{2.115}$$

that is, Real($\psi(\vec{x},t)$) = sum of cosine terms, each with a different wavelength
$\lambda_x = 2\pi/k_x$. As shown in Figure 2.6 below for two such cosine terms with
different wavelengths, we have regions where the waves are in phase and regions
where the waves are out of phase.

waves in phase $\to$ constructive interference
waves out of phase $\to$ destructive interference

Real($\psi(\vec{x},t)$) will be negligible in the regions where the cosine terms interfere
destructively. In the $x$-region where the cosine terms interfere constructively,
Real($\psi(\vec{x},t)$) will be non-negligible (similarly for Imag($\psi(\vec{x},t)$)).

[Figure panels: $\phi(k_a,t)\cos(k_ax)$ with wavelength $\lambda_a$; $\phi(k_b,t)\cos(k_bx)$ with wavelength $\lambda_b$]


Figure 2.6: Wave Interference 


Now consider the more general case: $\phi(\vec{k},t)$ is complex and $x$, $y$, and $z$ are
arbitrary. Let $\Delta x$ be the $x$-region in which $\psi(\vec{x},t)$ is non-negligible ($t$ fixed, $y$
and $z$ fixed). Let $\Delta k_x$ be the $k_x$ integration region over which $\phi(\vec{k},t)$ is
non-negligible. $\psi(\vec{x},t)$ is a superposition of terms $e^{ik_xx}$ of wavelength $\lambda_x = 2\pi/k_x$.
For $\psi(\vec{x},t)$ to be confined to a certain region $\Delta x$, there must be constructive interference of
the $e^{ik_xx}$ terms in this region and destructive interference everywhere else. Let
$\phi(\vec{k},t)$ be non-negligible only when $k_a < k_x < k_b$ ($t$, $k_y$, $k_z$ fixed) so that $\Delta k_x =
k_b - k_a$. There are

$$\frac{\Delta x}{\lambda_a} = \frac{k_a\Delta x}{2\pi} \tag{2.116}$$

wavelengths in the region $\Delta x$ when $k_x = k_a$ and

$$\frac{\Delta x}{\lambda_b} = \frac{k_b\Delta x}{2\pi} \tag{2.117}$$

wavelengths in the region $\Delta x$ when $k_x = k_b$. For the $e^{ik_xx}$ terms
($k_a < k_x < k_b$) to interfere destructively at the limits of (and beyond) the interval
$\Delta x$,

$$\frac{k_b\Delta x}{2\pi} - \frac{k_a\Delta x}{2\pi} \tag{2.118}$$

must be at least one, that is,

$$\frac{\Delta k_x\Delta x}{2\pi} \ge 1 \tag{2.119}$$

(similar arguments hold for the localization in the $y$ and $z$ directions). Therefore
we have

$$\Delta k_x\Delta x \ge 2\pi \;,\quad \Delta k_y\Delta y \ge 2\pi \;,\quad \Delta k_z\Delta z \ge 2\pi \tag{2.120}$$

or using $k = p/\hbar$, $\hbar = h/2\pi$ we have

$$\Delta p_x\Delta x \ge h \;,\quad \Delta p_y\Delta y \ge h \;,\quad \Delta p_z\Delta z \ge h \tag{2.121}$$

These are the so-called Heisenberg Uncertainty Relations. If a particle has
non-negligible probability to be found in a region $(\Delta x, \Delta y, \Delta z)$ of $\vec{x}$, that is, the
particle is said to be localized within this region, then the probability of measuring
the particle's linear momentum is non-negligible in the range $(\Delta p_x, \Delta p_y, \Delta p_z)$
of $\vec{p}$, with the Heisenberg Uncertainty Relations a necessary constraint on the
position spread and the momentum range. For example, if we prepare a particle
at $t_0$ so that it has non-negligible probability to be found with $x$-coordinate
in the interval $(x_0, x_0 + \Delta x)$ and negligible probability to be found elsewhere,
then any measurement of $p_x$ at $t_0$ will yield a value somewhere in the range
$(p_{x0}, p_{x0} + \Delta p_x)$ where $\Delta p_x \ge h/\Delta x$ ($x_0$ and $p_{x0}$ are arbitrary).
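The counting argument can be made concrete with the simplest possible packet: equal weights for all $k_x$ between $k_a$ and $k_b$ and zero outside. The sketch below assumes NumPy; the band edges, the flat weighting and the grid sizes are illustrative assumptions, not part of the text. It locates the first point of complete destructive interference on the positive $x$-axis and compares it with $2\pi/\Delta k_x$.

```python
import numpy as np

k_a, k_b = 5.0, 7.0                      # band of wave numbers with equal weight
n_k = 400
dk = (k_b - k_a) / n_k
k = k_a + dk * (np.arange(n_k) + 0.5)    # midpoint samples of the band

# psi(x) = integral over the band of exp(i k x) dk, sampled for x > 0
x = np.linspace(0.5, 5.0, 2251)
psi = np.exp(1j * np.outer(x, k)).sum(axis=1) * dk
amp = np.abs(psi)

delta_k = k_b - k_a
x_first_zero = x[np.argmin(amp)]         # first complete cancellation in this window
print(x_first_zero, 2 * np.pi / delta_k)
```

The packet is sizeable only for $|x|$ below about $2\pi/\Delta k_x$, so halving the width of the band doubles the extent of the packet, which is the content of $\Delta k_x\Delta x \ge 2\pi$.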

Classically, one specifies the precise position and momentum (or velocity) of
a particle at some initial time. One then finds the position of the particle as
a function of time by solving the classical equations of motion with the above
initial conditions. We now see that this formulation of a particle's time
development is impossible quantum mechanically: the Uncertainty Principle prohibits
the precise specification of both the particle's position and its momentum at
some initial time.

Indeed, $\Delta x = 0$ (particle localized at a point) $\to$ $\Delta p_x = \infty$, in some sense (the
particle's momentum is completely uncertain).

The spread in momentum necessitated by a spatial localization can be
illustrated in an electron diffraction experiment as shown in Figure 2.7 below.



Figure 2.7: Electron Diffraction 


Electrons which pass through the slit have a $y$-localization $\Delta y$ at the first screen.
Most of these electrons arrive at the second screen with $\theta < \theta_0$, where $\theta_0$ locates
the first diffraction minimum

$$d\sin\theta_0 = (\Delta y)\sin\theta_0 = \lambda = \frac{h}{p} \;\Rightarrow\; (\Delta y)(p\sin\theta_0) = h \tag{2.122}$$

But $p\sin\theta_0$ = $y$-component of momentum of an electron (with total momentum
of magnitude $p$) arriving very near the first diffraction minimum. Of course,
no electrons arrive exactly at $\theta_0$. Thus, most of the electrons (in the first or central
bump) arriving at the second screen have a $y$-component of momentum in
the range $[-p\sin\theta_0, +p\sin\theta_0]$ so that $\Delta p_y \approx p\sin\theta_0$ and thus $\Delta p_y\Delta y \ge h$ in
agreement with the Uncertainty Principle.

Discussion

When electrons are to the left of the first screen, $(\Delta p_y)_{incident} = 0$. Therefore,
$(\Delta y)_{incident} = \infty$ (so that $\Delta p_y\Delta y \ge h$) and the incident beam is uniformly
distributed over all $y$. Thus, the wave function $\psi_{incident}(\vec{x},t)$ for an electron to
the left of the first screen is independent of $y$ (so that $\wp_{incident}(\vec{x},t)$ is
independent of $y$). That is the meaning of an infinite plane wave. An electron which
passes through the slit (and eventually reaches the second screen) necessarily
has a new $\Delta y$ = linear dimension of the slit. The electron's passage through
the slit constitutes a measurement of the electron's $y$-coordinate to an accuracy
$\Delta y$. This measurement has necessarily changed the electron's wave function
because $(\Delta y)_{incident} \ne (\Delta y)_{slit}$. In general, a measurement made on a particle
will change the particle's wave function. In this example, $\psi_{incident}(\vec{x},t)$ is
independent of $y$. Right after the electron passes through the slit (corresponding
to a measurement of the electron's $y$-coordinate), the electron has a new wave
function $\psi_{new}(\vec{x},t)$ which is non-negligible only in a $\Delta y$ range (the slit's width).
Detection of the electron at the second screen allows us to determine $p_y$ of the
electron which passed through the slit. As we have seen, the spread in such $p_y$
values is $\Delta p_y \approx h/\Delta y$. It seems at this point that the HUP is implied by the
fundamental properties of the $x$ and $p$ bases and also the Fourier transform.
This latter dependence is deceiving and only appears to be true when using the
$x$ and $p$ bases.

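Rigorous Argument

Let $\hat{A}$ and $\hat{B}$ be two hermitian operators. For example, $\hat{A}$ and $\hat{B}$ can be chosen
from the operators $x$, $y$, $z$, $p_{x,op}$, $p_{y,op}$, $p_{z,op}$. Let $\psi(\vec{x},t)$ be the wave function
at time $t$ and let $\Delta A$ and $\Delta B$ be the rms deviations of measurements of $\hat{A}$ and
$\hat{B}$ at time $t$, defined by

$$(\Delta A)^2 = \langle(\hat{A} - \langle\hat{A}\rangle)^2\rangle = \langle\hat{A}^2\rangle - \langle\hat{A}\rangle^2 \tag{2.123}$$

$$(\Delta B)^2 = \langle(\hat{B} - \langle\hat{B}\rangle)^2\rangle = \langle\hat{B}^2\rangle - \langle\hat{B}\rangle^2 \tag{2.124}$$

Now let $\hat{C} = \hat{A} - \langle\hat{A}\rangle$ and $\hat{D} = \hat{B} - \langle\hat{B}\rangle$. Because $\langle\hat{A}\rangle$ and $\langle\hat{B}\rangle$ are real (hermiticity),
$\hat{C}$ and $\hat{D}$ are also hermitian. Now we assume that $\psi(\vec{x},t)$ is normalized so that
$\langle\psi|\psi\rangle = 1$. Then

$$(\Delta A)^2(\Delta B)^2 = \langle\hat{C}^2\rangle\langle\hat{D}^2\rangle = \langle\psi|\hat{C}\hat{C}\psi\rangle\langle\psi|\hat{D}\hat{D}\psi\rangle = \langle\hat{C}\psi|\hat{C}\psi\rangle\langle\hat{D}\psi|\hat{D}\psi\rangle$$

where we have used hermiticity in the last step. Then using the Schwarz
inequality we have

$$(\Delta A)^2(\Delta B)^2 \ge |\langle\hat{C}\psi|\hat{D}\psi\rangle|^2 = |\langle\psi|\hat{C}\hat{D}\psi\rangle|^2 \tag{2.125}$$

where equality holds if and only if $(\hat{C}\psi) = \lambda(\hat{D}\psi)$ where $\lambda$ = some constant.
Now we can always write

$$\hat{C}\hat{D} = \frac{1}{2}(\hat{C}\hat{D} + \hat{D}\hat{C}) + \frac{1}{2}(\hat{C}\hat{D} - \hat{D}\hat{C}) \tag{2.126}$$

and since $\hat{C}$ and $\hat{D}$ are hermitian, we have

$$\frac{1}{2}(\hat{C}\hat{D} + \hat{D}\hat{C}) \;\Rightarrow\; \text{hermitian} \tag{2.127}$$

$$\frac{1}{2}(\hat{C}\hat{D} - \hat{D}\hat{C}) \;\Rightarrow\; \text{anti-hermitian} \tag{2.128}$$

so that

$$\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} + \hat{D}\hat{C})\psi\right.\right\rangle \;\Rightarrow\; \text{pure real} \tag{2.129}$$

$$\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} - \hat{D}\hat{C})\psi\right.\right\rangle \;\Rightarrow\; \text{pure imaginary} \tag{2.130}$$

Proof

$$\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} \pm \hat{D}\hat{C})\psi\right.\right\rangle = \tfrac{1}{2}\langle\psi|\hat{C}\hat{D}\psi\rangle \pm \tfrac{1}{2}\langle\psi|\hat{D}\hat{C}\psi\rangle$$

$$= \tfrac{1}{2}\langle\hat{C}\psi|\hat{D}\psi\rangle \pm \tfrac{1}{2}\langle\hat{D}\psi|\hat{C}\psi\rangle$$

$$= \tfrac{1}{2}\langle\hat{D}\hat{C}\psi|\psi\rangle \pm \tfrac{1}{2}\langle\hat{C}\hat{D}\psi|\psi\rangle$$

$$= \tfrac{1}{2}\langle\psi|\hat{D}\hat{C}\psi\rangle^* \pm \tfrac{1}{2}\langle\psi|\hat{C}\hat{D}\psi\rangle^*$$

$$= \pm\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} \pm \hat{D}\hat{C})\psi\right.\right\rangle^*$$

Continuing, we then have

$$(\Delta A)^2(\Delta B)^2 \ge \left|\underbrace{\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} + \hat{D}\hat{C})\psi\right.\right\rangle}_{\text{pure real}} + \underbrace{\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} - \hat{D}\hat{C})\psi\right.\right\rangle}_{\text{pure imaginary}}\right|^2$$

$$= \left|\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} + \hat{D}\hat{C})\psi\right.\right\rangle\right|^2 + \left|\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} - \hat{D}\hat{C})\psi\right.\right\rangle\right|^2 \ge \left|\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} - \hat{D}\hat{C})\psi\right.\right\rangle\right|^2 \tag{2.131}$$

where equality holds in the last step if

$$\left\langle\psi\left|\tfrac{1}{2}(\hat{C}\hat{D} + \hat{D}\hat{C})\psi\right.\right\rangle = 0$$

Finally, we have

$$(\Delta A)^2(\Delta B)^2 \ge \frac{1}{4}\left|\langle\psi|[\hat{C},\hat{D}]\psi\rangle\right|^2 \tag{2.132}$$

However,

$$[\hat{C},\hat{D}] = (\hat{A} - \langle\hat{A}\rangle)(\hat{B} - \langle\hat{B}\rangle) - (\hat{B} - \langle\hat{B}\rangle)(\hat{A} - \langle\hat{A}\rangle) = \hat{A}\hat{B} - \hat{B}\hat{A} = [\hat{A},\hat{B}]$$

so that

$$(\Delta A)^2(\Delta B)^2 \ge \frac{1}{4}\left|\langle\psi|[\hat{A},\hat{B}]\psi\rangle\right|^2 \tag{2.133}$$

or our final result is

$$(\Delta A)(\Delta B) \ge \frac{1}{2}\left|\langle\psi|[\hat{A},\hat{B}]\psi\rangle\right| \tag{2.134}$$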

where equality holds if 2 conditions are simultaneously satisfied: 

1. (Cip) = A (Dip) for some constant A 

2. (ip | l(CD + DC)ip) = 0 

This result is the Uncertainty Principle. It is a necessary constraint on the rms 
deviations of measurements of A and B when a particle has the wave function 

ip(x,t)- 


This result holds for any hermitian operators A and B. A and B may be chosen
from the operators $x, y, z, p_{x,op}, p_{y,op}, p_{z,op}$; however, in later applications we
will use other operators.

We note that $\Delta A$, $\Delta B$ and $\langle \psi \mid [A, B]\psi \rangle$ refer to quantities evaluated at the
same time t because each has been expressed in terms of $\psi(\vec{x},t)$ evaluated at
one fixed time t.

Examples

Using $[x_i, p_{j,op}] = i\hbar\delta_{ij}$, $[x_i, x_j] = 0$ and $[p_{i,op}, p_{j,op}] = 0$ we have

$$(\Delta x_i)(\Delta x_j) \geq 0 \quad , \quad (\Delta p_i)(\Delta p_j) \geq 0 \tag{2.135}$$

and

$$(\Delta x_i)(\Delta p_j) \geq \frac{\hbar}{2}\delta_{ij} \tag{2.136}$$

which agree with the results of our heuristic arguments.
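
The inequality (2.134) can be spot-checked numerically. The following is a minimal sketch, assuming that finite-dimensional hermitian matrices stand in for the operators A and B and that a normalized complex vector stands in for the wave function; this finite-dimensional setting and the helper names `random_hermitian`, `expectation` and `uncertainty_sides` are illustrative assumptions, not part of the derivation above.

```python
import numpy as np


def random_hermitian(n, rng):
    """Return a random n x n hermitian matrix (hypothetical stand-in for an operator)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2


def expectation(op, psi):
    """<psi | op psi> for a normalized state vector psi."""
    return np.vdot(psi, op @ psi)


def uncertainty_sides(A, B, psi):
    """Return (Delta A * Delta B, |<[A, B]>| / 2) for a normalized psi."""
    var_a = (expectation(A @ A, psi) - expectation(A, psi) ** 2).real
    var_b = (expectation(B @ B, psi) - expectation(B, psi) ** 2).real
    commutator = A @ B - B @ A
    lhs = np.sqrt(max(var_a, 0.0)) * np.sqrt(max(var_b, 0.0))
    rhs = abs(expectation(commutator, psi)) / 2
    return lhs, rhs


rng = np.random.default_rng(0)
results = []
for _ in range(100):
    A = random_hermitian(4, rng)
    B = random_hermitian(4, rng)
    psi = rng.normal(size=4) + 1j * rng.normal(size=4)
    psi /= np.linalg.norm(psi)
    results.append(uncertainty_sides(A, B, psi))

print(all(lhs >= rhs - 1e-12 for lhs, rhs in results))
```

Every random trial should satisfy the bound; a violation would indicate an error in the code rather than in the inequality.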


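Let us find $\psi(\vec{x},t)$ such that the equalities hold in the equations

$$\Delta x \Delta p_x \geq \frac{\hbar}{2} \quad , \quad \Delta y \Delta p_y \geq \frac{\hbar}{2} \quad , \quad \Delta z \Delta p_z \geq \frac{\hbar}{2} \tag{2.137}$$

For simplicity we work with the x equation. Similar results follow for the other
equations. Now

$$\Delta x \Delta p_x = \frac{\hbar}{2} \tag{2.138}$$

if

1. $C\psi = \lambda D\psi$ for some constant $\lambda$

2. $\langle \psi \mid \tfrac{1}{2}(CD + DC)\psi \rangle = 0$

where $C = A - \langle A \rangle = p_{x,op} - \langle p_x \rangle$ and $D = B - \langle B \rangle = x - \langle x \rangle$. Substituting
condition (1) into condition (2) we get

$$\langle \psi \mid \tfrac{1}{2}(CD + DC)\psi \rangle = 0 \;\Rightarrow\; 0 = \langle \psi \mid CD\psi \rangle + \langle \psi \mid DC\psi \rangle$$

$$= \langle C\psi \mid D\psi \rangle + \langle \psi \mid D(C\psi) \rangle$$

$$= \langle \lambda D\psi \mid D\psi \rangle + \langle \psi \mid \lambda DD\psi \rangle$$

$$= \lambda^* \langle \psi \mid DD\psi \rangle + \lambda \langle \psi \mid DD\psi \rangle$$

$$\Rightarrow 0 = (\lambda^* + \lambda) \langle \psi \mid DD\psi \rangle$$

Since $\langle \psi \mid DD\psi \rangle \neq 0$, we have $\lambda^* + \lambda = 0 \rightarrow \lambda^* = -\lambda \rightarrow \lambda = i\alpha$ is pure imaginary;
$\alpha$ real. The case $\langle \psi \mid DD\psi \rangle = 0$, which implies that $\langle (x - \langle x \rangle)^2 \rangle = 0 = (\Delta x)^2$, is
not considered here because $\Delta x = 0 \rightarrow \Delta p_x = \infty$. Now

$$C\psi = \lambda D\psi \;\Rightarrow\; (p_{x,op} - \langle p_x \rangle)\psi = i\alpha (x - \langle x \rangle)\psi$$

$$\left( \frac{\hbar}{i}\frac{\partial}{\partial x} - \langle p_x \rangle \right)\psi = i\alpha (x - \langle x \rangle)\psi$$

$$\frac{\partial \psi}{\partial x} = \frac{i}{\hbar}\langle p_x \rangle \psi - \frac{\alpha}{\hbar}(x - \langle x \rangle)\psi$$

This has solution

$$\psi(\vec{x},t) = F(y,z,t)\, e^{\frac{i}{\hbar}\langle p_x \rangle x}\, e^{-\frac{\alpha}{2\hbar}(x - \langle x \rangle)^2} \tag{2.139}$$

Note that $\alpha$ must be positive for $\psi(\vec{x},t)$ to be normalizable. To see the
significance of $\alpha$, let us calculate

$$(\Delta x)^2 = \langle (x - \langle x \rangle)^2 \rangle \tag{2.140}$$

We have

$$(\Delta x)^2 = \frac{\int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\, |F(y,z,t)|^2 \int_{-\infty}^{\infty} dx\, e^{-\frac{\alpha}{\hbar}(x - \langle x \rangle)^2} (x - \langle x \rangle)^2}{\int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\, |F(y,z,t)|^2 \int_{-\infty}^{\infty} dx\, e^{-\frac{\alpha}{\hbar}(x - \langle x \rangle)^2}}$$

$$= \frac{\int_{-\infty}^{\infty} dx\, e^{-\frac{\alpha}{\hbar}(x - \langle x \rangle)^2} (x - \langle x \rangle)^2}{\int_{-\infty}^{\infty} dx\, e^{-\frac{\alpha}{\hbar}(x - \langle x \rangle)^2}} = \frac{\int_{-\infty}^{\infty} du\, u^2 e^{-\frac{\alpha}{\hbar}u^2}}{\int_{-\infty}^{\infty} du\, e^{-\frac{\alpha}{\hbar}u^2}}$$

$$= \frac{\frac{\sqrt{\pi}}{2}\left(\frac{\hbar}{\alpha}\right)^{3/2}}{\sqrt{\pi}\left(\frac{\hbar}{\alpha}\right)^{1/2}} = \frac{\hbar}{2\alpha}$$

so that

$$\psi(\vec{x},t) = F(y,z,t)\, e^{\frac{i}{\hbar}\langle p_x \rangle x}\, e^{-\frac{(x - \langle x \rangle)^2}{4(\Delta x)^2}} \tag{2.141}$$

F(y,z,t) is determined similarly from $\Delta y \Delta p_y = \hbar/2$ and $\Delta z \Delta p_z = \hbar/2$.
Therefore,

$$\psi(\vec{x},t) = N(t)\, e^{\frac{i}{\hbar}\left( \langle p_x \rangle x + \langle p_y \rangle y + \langle p_z \rangle z \right)}\, e^{-\left[ \frac{(x - \langle x \rangle)^2}{4(\Delta x)^2} + \frac{(y - \langle y \rangle)^2}{4(\Delta y)^2} + \frac{(z - \langle z \rangle)^2}{4(\Delta z)^2} \right]} \tag{2.142}$$

if

$$\Delta x \Delta p_x = \frac{\hbar}{2} \quad , \quad \Delta y \Delta p_y = \frac{\hbar}{2} \quad , \quad \Delta z \Delta p_z = \frac{\hbar}{2} \tag{2.143}$$

In the above expression $\langle x \rangle$, $\langle y \rangle$, $\langle z \rangle$, $\langle p_x \rangle$, $\langle p_y \rangle$, $\langle p_z \rangle$, $\Delta x$, $\Delta y$, $\Delta z$ are
arbitrary real numbers. The above expression gives a Gaussian distribution for

$$\wp(\vec{x},t) = \frac{|\psi(\vec{x},t)|^2}{\langle \psi \mid \psi \rangle} \tag{2.144}$$

For any arbitrary (non-Gaussian) wave function $\psi(\vec{x},t)$, we must have

$$(\Delta x_j)(\Delta p_j) > \frac{\hbar}{2} \tag{2.145}$$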
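
The claim that the Gaussian (2.141) saturates the bound can be checked on a grid. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, a one-dimensional grid, arbitrarily chosen values for $\langle x \rangle$, $\langle p_x \rangle$ and $\Delta x$, and a central-difference derivative from `numpy.gradient`. It uses $\langle p_x^2 \rangle = \hbar^2 \int |\partial\psi/\partial x|^2 dx$, which follows from hermiticity.

```python
import numpy as np

hbar = 1.0                               # assumed units
x0, p0, dx_target = 0.7, 1.3, 0.5        # arbitrary <x>, <p_x>, Delta x

x = np.linspace(-20.0, 20.0, 20001)
h = x[1] - x[0]

# Minimum-uncertainty wave function of eq. (2.141), x-dependence only.
psi = np.exp(1j * p0 * x / hbar) * np.exp(-(x - x0) ** 2 / (4 * dx_target ** 2))
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * h)     # normalize <psi|psi> = 1

dpsi = np.gradient(psi, h)                       # central-difference d(psi)/dx

mean_x = np.sum(x * np.abs(psi) ** 2) * h
mean_x2 = np.sum(x ** 2 * np.abs(psi) ** 2) * h
mean_p = (np.sum(np.conj(psi) * (-1j * hbar) * dpsi) * h).real
mean_p2 = hbar ** 2 * np.sum(np.abs(dpsi) ** 2) * h   # <p psi | p psi>

dx = np.sqrt(mean_x2 - mean_x ** 2)
dp = np.sqrt(mean_p2 - mean_p ** 2)
print(dx, dp, dx * dp)    # the product should be close to hbar / 2
```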
Application of the Uncertainty Principle

Consider a particle constrained to move on the x-axis (idealized one-dimensional
problem). $F_x = F_x(x)$ is the x-component of the force acting on the particle.
This force depends on the position of the particle. Let x = 0 be an equilibrium
point, that is, $F_x(x = 0) = 0$. Taylor expanding $F_x(x)$ about x = 0 we have

$$F_x(x) = \underbrace{F_x(0)}_{=0} + \left[ \frac{dF_x}{dx} \right]_{x=0} x + \frac{1}{2!}\left[ \frac{d^2F_x}{dx^2} \right]_{x=0} x^2 + \ldots \tag{2.146}$$

For sufficiently small x, we can use

$$F_x(x) \approx \left[ \frac{dF_x}{dx} \right]_{x=0} x \tag{2.147}$$

For

$$\left[ \frac{dF_x}{dx} \right]_{x=0} > 0 \tag{2.148}$$

we have unstable equilibrium ($F_x$ is directed away from the origin when the
particle is slightly displaced from the equilibrium position).

For

$$\left[ \frac{dF_x}{dx} \right]_{x=0} < 0 \tag{2.149}$$

we have stable equilibrium ($F_x$ is directed toward the origin when the particle
is slightly displaced from the equilibrium position).

Considering small displacements from a stable equilibrium point at x = 0:

$$F_x(x) = -kx \quad , \quad k = -\left[ \frac{dF_x}{dx} \right]_{x=0} > 0 \tag{2.150}$$

Classically, we have

$$m\frac{d^2x}{dt^2} = F_x = -kx \;\rightarrow\; \frac{d^2x}{dt^2} + \frac{k}{m}x = 0 \tag{2.151}$$

so that

$$x = A\cos\omega t + B\sin\omega t \quad , \quad \omega = \sqrt{\frac{k}{m}} \tag{2.152}$$

The total energy of such a particle is given by

$$E = \frac{1}{2}mv_x^2 + \frac{1}{2}kx^2 = \frac{p_x^2}{2m} + \frac{1}{2}m\omega^2 x^2 \tag{2.153}$$

Quantum mechanically,

$$\langle E \rangle = \frac{1}{2m}\langle p_x^2 \rangle + \frac{1}{2}m\omega^2 \langle x^2 \rangle \tag{2.154}$$

when the particle is described by the wave function $\psi(x,t)$.

Note that this is really an assumption because, up to now, we have only
considered $\langle f(x) \rangle$ and $\langle g(p) \rangle$. We have not yet discussed the quantum mechanical
computation of $\langle f(x) + g(p) \rangle$. This is not obvious since x and p cannot be
simultaneously measured and will be discussed in detail later.


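Now

$$(\Delta p_x)^2 = \langle p_x^2 \rangle - \langle p_x \rangle^2 \quad , \quad (\Delta x)^2 = \langle x^2 \rangle - \langle x \rangle^2 \tag{2.155}$$

Therefore,

$$\langle E \rangle = \frac{1}{2m}\langle p_x^2 \rangle + \frac{1}{2}m\omega^2 \langle x^2 \rangle \geq \frac{(\Delta p_x)^2}{2m} + \frac{1}{2}m\omega^2 (\Delta x)^2 \tag{2.156}$$

But,

$$\Delta x \Delta p_x \geq \frac{\hbar}{2} \;\Rightarrow\; (\Delta p_x)^2 \geq \frac{\hbar^2}{4(\Delta x)^2} \tag{2.157}$$

from the uncertainty principle. Therefore,

$$\langle E \rangle \geq \frac{\hbar^2}{8m(\Delta x)^2} + \frac{1}{2}m\omega^2 (\Delta x)^2 = G((\Delta x)^2) \tag{2.158}$$

This result holds for any wave function $\psi(x,t)$ since $\psi(x,t)$ determines both $\langle E \rangle$
and $\Delta x$. Therefore, $\langle E \rangle$ for any $\psi(x,t)$ is $\geq$ the minimum value of $G((\Delta x)^2)$.
A sketch of $G((\Delta x)^2)$ is shown in Figure 2.8 below.

Figure 2.8: $G((\Delta x)^2)$ versus $(\Delta x)^2$

We find the minimum using

$$\frac{dG}{d((\Delta x)^2)} = 0 = -\frac{\hbar^2}{8m}\frac{1}{[(\Delta x)^2]^2} + \frac{1}{2}m\omega^2 \;\rightarrow\; (\Delta x)^4 = \frac{\hbar^2}{4m^2\omega^2} \tag{2.159}$$

at the minimum. Therefore,

$$\langle E \rangle \geq G_{min}((\Delta x)^2) = \frac{\hbar\omega}{2} \tag{2.160}$$

for any wave function $\psi(x,t)$.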
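
The minimization leading to (2.160) can be reproduced numerically. The sketch below is an illustration, assuming units with $m = \omega = \hbar = 1$ and a simple golden-section search (the function names `G` and `golden_section_min` are hypothetical); it compares the numerical minimum of $G$ with the analytic values $(\Delta x)^2 = \hbar/(2m\omega)$ and $\hbar\omega/2$.

```python
import math


def G(s, m=1.0, omega=1.0, hbar=1.0):
    """Lower bound on <E> as a function of s = (Delta x)^2, eq. (2.158)."""
    return hbar ** 2 / (8 * m * s) + 0.5 * m * omega ** 2 * s


def golden_section_min(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


s_min = golden_section_min(G, 1e-3, 10.0)
g_min = G(s_min)
print(s_min, g_min)   # compare with hbar/(2 m omega) and hbar*omega/2
```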

$\hbar\omega/2$ is called the zero-point energy of the particle. Classically, the particle
could be located at x = 0 with zero velocity so that its energy would be zero.

Quantum mechanically, the uncertainty principle prohibits the particle from
having a precise position at x = 0 and, at the same time, a precise momentum
of $p_x = 0$. The required spreads $\Delta x$ and $\Delta p_x$ satisfying $\Delta x \Delta p_x \geq \hbar/2$ imply
that the average energy of the particle can never be zero. This has striking
consequences when the particle happens to be charged. Classically, an oscillating
charge would radiate energy and thereby lose energy until it had no remaining
energy. Quantum mechanically, the particle can radiate energy, but its (average)
energy can never go below $\hbar\omega/2$. Thus, if the particle has $\langle E \rangle = \hbar\omega/2$ it cannot
radiate any energy (even though the particle will, in some sense, be oscillating
about x = 0). Clearly, the classical Maxwell equations must be modified and
reinterpreted so that an oscillating charge in its ground state (which necessarily
has $\langle E \rangle > 0$ for a harmonic oscillator) will not radiate!

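I should also like to mention that this zero-point energy phenomenon is
responsible for the stability of matter. One may think of the hydrogen atom, for
example, as a proton with an electron circling about it. Classically, this
accelerating electron would spiral into the proton as it continually radiated away its
energy. The hydrogen atom would collapse! Such a fate is prevented by the
uncertainty principle, which implies that the electron will have a ground-state
energy where the electron and the proton have a non-zero separation: the
electron cannot be precisely at the proton's position with precisely zero momentum.

In this case, we have

$$E = \frac{\vec{p} \cdot \vec{p}}{2m} - \frac{e^2}{r} \tag{2.161}$$

and e = electron charge, r = distance between electron and proton. Classically,
$E_{lowest} = -\infty$ (r = 0, p = 0). We can obtain a rough estimate of $E_{min}$ quantum
mechanically. We have

$$\langle E \rangle = \frac{1}{2m}\langle \vec{p} \cdot \vec{p} \rangle - e^2 \left\langle \frac{1}{r} \right\rangle \tag{2.162}$$

where m = electron mass. As a rough order of magnitude estimate we use

$$\langle \vec{p} \cdot \vec{p} \rangle \approx \langle p_{radial}^2 \rangle \approx (\Delta p_{radial})^2 \quad , \quad \left\langle \frac{1}{r} \right\rangle \approx \frac{1}{\langle r \rangle} \approx \frac{1}{\Delta r}$$

so that

$$\langle E \rangle \approx \frac{(\Delta p_{radial})^2}{2m} - \frac{e^2}{\Delta r} \quad , \quad (\Delta p_{radial})\Delta r \geq \frac{\hbar}{2} \approx \hbar \tag{2.163}$$

Therefore,

$$\langle E \rangle \geq \frac{\hbar^2}{2m(\Delta r)^2} - \frac{e^2}{\Delta r} = G(\Delta r) \;\Rightarrow\; \langle E \rangle \geq G_{minimum} \tag{2.164}$$

Minimizing G we have

$$\frac{dG}{d(\Delta r)} = 0 = -\frac{\hbar^2}{m}\frac{1}{(\Delta r)^3} + \frac{e^2}{(\Delta r)^2} \tag{2.165}$$

at the minimum. We get

$$\Delta r = \frac{\hbar^2}{me^2} \;\rightarrow\; \langle E \rangle \geq G_{minimum} = -\frac{1}{2}\frac{me^4}{\hbar^2} \tag{2.166}$$

and

$$\langle r \rangle \approx \Delta r = \frac{\hbar^2}{me^2} \tag{2.167}$$

as a rough estimate of the hydrogen atom size.

Some numbers are:

$$\langle r \rangle \approx \Delta r = \frac{\hbar^2}{me^2} \approx 10^{-8}\ \text{cm} = 1\ \text{Å} \tag{2.168}$$

or the uncertainty principle tells us that the hydrogen atom has a radius of
roughly 1 Angstrom!!! (verified experimentally). We also have

$$E_{min} = -\frac{1}{2}\frac{me^4}{\hbar^2} \approx -10^{-11}\ \text{erg} \approx -6\ \text{eV} \tag{2.169}$$

which is very different from the classical $E_{lowest} = -\infty$!!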
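
For readers who want to plug in numbers, the sketch below evaluates $\Delta r = \hbar^2/(me^2)$ and $G_{minimum} = -me^4/(2\hbar^2)$ directly. The rounded Gaussian-unit (CGS) constants are assumed values, not taken from the text. With them the formulas give about 0.5 Å and about $-2 \times 10^{-11}$ erg (roughly $-13.6$ eV), the same orders of magnitude as the rough figures quoted above.

```python
# Rounded CGS (Gaussian-unit) constants; assumed values, not taken from the text.
HBAR = 1.0546e-27        # erg s
M_E = 9.109e-28          # g, electron mass
E_CHARGE = 4.803e-10     # esu, electron charge
ERG_PER_EV = 1.602e-12   # erg per electron volt


def G(dr):
    """Uncertainty-principle lower bound on <E> as a function of Delta r, eq. (2.164)."""
    return HBAR ** 2 / (2 * M_E * dr ** 2) - E_CHARGE ** 2 / dr


dr_min = HBAR ** 2 / (M_E * E_CHARGE ** 2)   # eq. (2.166), in cm
E_min_erg = G(dr_min)                        # equals -m e^4 / (2 hbar^2)
E_min_eV = E_min_erg / ERG_PER_EV

print(f"Delta r = {dr_min:.3e} cm")
print(f"E_min   = {E_min_erg:.3e} erg = {E_min_eV:.2f} eV")
```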


We now return to our development of the quantum theory. 

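So far, we have considered the average values of functions of position

$$\langle f(\vec{x}) \rangle = \frac{\langle \psi \mid f(\vec{x})\psi \rangle}{\langle \psi \mid \psi \rangle} \tag{2.170}$$

and the average values of functions of momentum,

$$\langle g(\vec{p}) \rangle = \frac{\langle \psi \mid g(\vec{p}_{op})\psi \rangle}{\langle \psi \mid \psi \rangle} \quad , \quad \vec{p}_{op} = \frac{\hbar}{i}\nabla \tag{2.171}$$

Of course, there are other kinds of functions where the average values are
physically important, namely, functions of both position and momentum $A(\vec{x},\vec{p})$.
For example, the angular momentum of a particle is given by $\vec{L} = \vec{r} \times \vec{p}$ and the
total energy of a particle with potential energy $V(\vec{x})$ is given by

$$E = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \tag{2.172}$$

Postulate 3 is an obvious generalization of the above expressions for $\langle f(\vec{x}) \rangle$ and
$\langle g(\vec{p}) \rangle$ to functions $A(\vec{x},\vec{p})$.

2.1.7. Postulate 3a

For every physical quantity $A(\vec{x},\vec{p})$ where $A(\vec{x},\vec{p})$ is a real function of $\vec{x}$ and $\vec{p}$,
there exists a linear operator

$$A_{op} = A(\vec{x},\vec{p}_{op}) \quad , \quad \vec{p}_{op} = \frac{\hbar}{i}\nabla \tag{2.173}$$

such that for a particle with wave function $\psi(\vec{x},t)$

$$\langle A(\vec{x},\vec{p}) \rangle = \frac{\langle \psi \mid A_{op}\psi \rangle}{\langle \psi \mid \psi \rangle} \tag{2.174}$$

is the average value of $A(\vec{x},\vec{p})$ at time t and

$$\langle f(A(\vec{x},\vec{p})) \rangle = \frac{\langle \psi \mid f(A_{op})\psi \rangle}{\langle \psi \mid \psi \rangle} \tag{2.175}$$

is the average value of $f(A(\vec{x},\vec{p}))$ at time t, where $f(A(\vec{x},\vec{p}))$ is an arbitrary
function of $A(\vec{x},\vec{p})$.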

Comments

1. Postulate 3a agrees with our previous results for $\langle f(\vec{x}) \rangle$ and $\langle g(\vec{p}) \rangle$
expressed in terms of the operators $f(\vec{x})$ and $g(\vec{p}_{op})$.

2. Consider $A(\vec{x},\vec{p}) = f(\vec{x}) + g(\vec{p})$. Then $\langle A(\vec{x},\vec{p}) \rangle$ is given by

$$\langle A(\vec{x},\vec{p}) \rangle = \frac{\langle \psi \mid (f(\vec{x}) + g(\vec{p}_{op}))\psi \rangle}{\langle \psi \mid \psi \rangle} = \frac{\langle \psi \mid f(\vec{x})\psi \rangle + \langle \psi \mid g(\vec{p}_{op})\psi \rangle}{\langle \psi \mid \psi \rangle} = \langle f(\vec{x}) \rangle + \langle g(\vec{p}) \rangle \tag{2.176}$$

which we assumed to be true earlier. This result

$$\langle f(\vec{x}) + g(\vec{p}_{op}) \rangle = \langle f(\vec{x}) \rangle + \langle g(\vec{p}_{op}) \rangle \tag{2.177}$$

is not a trivial result. Consider an ensemble of identically-prepared
particles. $\langle f(\vec{x}) \rangle$ is the average value of the measurements of $f(\vec{x})$ performed
on many of these particles, while $\langle g(\vec{p}_{op}) \rangle$ is the average value of
measurements of $g(\vec{p}_{op})$ performed on many of the particles. To measure
$\langle f(\vec{x}) + g(\vec{p}_{op}) \rangle$ we must do repeated measurements of the one physical
quantity $f(\vec{x}) + g(\vec{p}_{op})$, that is, for each particle in the ensemble we
must somehow perform a single measurement of the physical quantity
$f(\vec{x}) + g(\vec{p}_{op})$ and then we must find the average value of such
measurements. We cannot perform 2 measurements on each particle, the first to
measure $f(\vec{x})$ and the second to measure $g(\vec{p}_{op})$, because the first
measurement of $f(\vec{x})$ involves a position measurement which, according to the
Uncertainty Principle, will change the particle's momentum distribution

$$(\Delta p_i)_{\text{after first measurement}} \geq \frac{\hbar}{2\Delta x_i} \tag{2.178}$$

where $\Delta x_i$ is the accuracy of the first position measurement, so that the
second measurement of $g(\vec{p}_{op})$ does not measure $g(\vec{p}_{op})$ for the particle in
its original state.

Thus, it is not a priori obvious that $\langle f(\vec{x}) + g(\vec{p}_{op}) \rangle$, calculated from
repeated measurements of the single physical quantity $f(\vec{x}) + g(\vec{p}_{op})$, will
equal $\langle f(\vec{x}) \rangle + \langle g(\vec{p}_{op}) \rangle$. Postulate 3a, however, asserts that this
equality does hold; the above discussion implies that this is not a trivial
result.

3. Postulate 3a, as it stands, is ambiguous. For example, consider the physical
quantities $A_1(\vec{x},\vec{p}) = xp_x$ and $A_2(\vec{x},\vec{p}) = p_x x$. Classically, $A_1(\vec{x},\vec{p}) = xp_x =
p_x x = A_2(\vec{x},\vec{p})$ because x and $p_x$ are just measured numbers. However,
the quantum operators $A_{1op}$ and $A_{2op}$ differ:

$$A_{1op} = xp_{x,op} \quad , \quad A_{2op} = p_{x,op}x \tag{2.179}$$

where

$$A_{1op} - A_{2op} = [x, p_{x,op}] = i\hbar \neq 0 \tag{2.180}$$

We, therefore, do not know what operator is associated with the classical
quantity $xp_x = p_x x$. However, we must require that $\langle xp_x \rangle$ be real (because
each of the repeated measurements of $xp_x$ will be real). Thus, the operator
$A_{op}$ corresponding to $xp_x$ must be such that

$$\langle xp_x \rangle = \frac{\langle \psi \mid A_{op}\psi \rangle}{\langle \psi \mid \psi \rangle} \tag{2.181}$$

is real for any $\psi(\vec{x},t)$. This means that $\langle \psi \mid A_{op}\psi \rangle$ must be real for all
$\psi(\vec{x},t)$. This requires that $A_{op}$ be hermitian. Neither $A_{1op}$ nor $A_{2op}$ is
hermitian, that is, we have

$$\langle \psi \mid xp_{x,op}\psi \rangle = \langle x\psi \mid p_{x,op}\psi \rangle = \langle p_{x,op}x\psi \mid \psi \rangle$$

$$\rightarrow \langle \psi \mid A_{1op}\psi \rangle = \langle A_{2op}\psi \mid \psi \rangle \quad , \quad A_{1op} \neq A_{2op}$$

However, we can write

$$A(\vec{x},\vec{p}) = xp_x = p_x x = \frac{xp_x + p_x x}{2} \tag{2.182}$$

for the classical physical quantity. The operator

$$A_{op} = \frac{xp_{x,op} + p_{x,op}x}{2} \tag{2.183}$$

is hermitian and is therefore an acceptable operator to be associated with
$A(\vec{x},\vec{p})$. $A_{1op} = xp_{x,op}$ and $A_{2op} = p_{x,op}x$ are not hermitian and are therefore
not acceptable. In general, we must require that the operator $A_{op}$
corresponding to some physical quantity $A(\vec{x},\vec{p})$ be hermitian so that

$$\langle A \rangle = \frac{\langle \psi \mid A_{op}\psi \rangle}{\langle \psi \mid \psi \rangle} \tag{2.184}$$

is real for any $\psi(\vec{x},t)$.
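
The ordering problem can be illustrated with finite matrices. The sketch below is a discretized stand-in, not the operators themselves: it assumes $\hbar = 1$, a uniform grid, a diagonal matrix `X` for position and a central-difference matrix `P` for $p_{x,op}$ with the wave function taken to vanish outside the grid. All names and grid parameters are illustrative.

```python
import numpy as np

hbar = 1.0            # assumed units
n, h = 50, 0.1        # number of grid points and grid spacing (arbitrary)
x = (np.arange(n) - n // 2) * h

X = np.diag(x).astype(complex)
# Central-difference derivative: a real antisymmetric matrix.
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)
P = -1j * hbar * D    # hermitian because D is real and antisymmetric


def is_hermitian(M, tol=1e-12):
    """True if M equals its conjugate transpose."""
    return np.allclose(M, M.conj().T, atol=tol)


A1 = X @ P                  # ordering x p
A2 = P @ X                  # ordering p x
A_sym = (A1 + A2) / 2       # symmetrized ordering of eq. (2.183)

print(is_hermitian(A1), is_hermitian(A2), is_hermitian(A_sym))
```

In this model the conjugate transpose of `A1` is `A2`, mirroring the relation $\langle \psi \mid A_{1op}\psi \rangle = \langle A_{2op}\psi \mid \psi \rangle$ above, and only the symmetrized combination is hermitian.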


2.1.8. Postulate 3b 

Let $A(\vec{x},\vec{p})$ be some physical quantity (a real function of $\vec{x}$ and $\vec{p}$). The order
of the classical variables $x_i$ and $p_j$ in $A(\vec{x},\vec{p})$ is immaterial. The corresponding
quantum operator $A_{op} = A(\vec{x},\vec{p}_{op})$, however, depends on the ordering of the
non-commuting factors of $x_i$ and $p_{j,op}$. This ambiguity in ordering is (partially)
removed by requiring $A_{op}$ to be hermitian (so that $\langle A \rangle$ is real).

Note: The hermiticity requirement does not completely remove the ambiguity.
Indeed, hermitian operators differing only by the ordering of $x_i$ and $p_{j,op}$ in a
term correspond to the same classical quantity $A(\vec{x},\vec{p})$ but represent different
quantum mechanical quantities. For example, let $A(\vec{x},\vec{p}) = x^2 p_x^2$ classically.
Then

$$A_{1op} = \tfrac{1}{2}\left( x^2 p_{x,op}^2 + p_{x,op}^2 x^2 \right) \quad \text{and} \quad A_{2op} = xp_{x,op}^2 x \tag{2.185}$$

are two possible hermitian operators. However, they represent different
quantum mechanical quantities even though they correspond to the same classical
quantity. Only experiment can determine which ordering of non-commuting
factors yields the hermitian operator $A_{op}$ that corresponds to a physical quantity
$A(\vec{x},\vec{p})$ measured in a specified way. We will not have to worry about such
problems at this level since there will be no ordering ambiguity in the operators
we will use.

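Examples

1. Energy of a particle in a conservative force field

Definition: The force on a particle is conservative if there exists a
function $V(\vec{x})$, called the particle's potential energy, such that $\vec{F} = -\nabla V$.

Let $W_{1 \rightarrow 2}$ be the work done on the particle by $\vec{F}$ as the particle moves
from $\vec{x}_1$ to $\vec{x}_2$ along some path

$$W_{1 \rightarrow 2} = \int_{\vec{x}_1}^{\vec{x}_2} \vec{F} \cdot d\vec{s} \quad , \quad d\vec{s} = dx\,\hat{x} + dy\,\hat{y} + dz\,\hat{z} \tag{2.186}$$

Therefore,

$$W_{1 \rightarrow 2} = -\int_{\vec{x}_1}^{\vec{x}_2} \nabla V \cdot d\vec{s} = -\int_{\vec{x}_1}^{\vec{x}_2} \left( \frac{\partial V}{\partial x}dx + \frac{\partial V}{\partial y}dy + \frac{\partial V}{\partial z}dz \right) = -\int_{\vec{x}_1}^{\vec{x}_2} dV = -\left[ V(\vec{x}_2) - V(\vec{x}_1) \right] \tag{2.187}$$

which is independent of the particular path connecting $\vec{x}_1$ and $\vec{x}_2$. We can
calculate $W_{1 \rightarrow 2}$ in another way:

$$W_{1 \rightarrow 2} = \int \vec{F} \cdot d\vec{s} = \int m\frac{d\vec{v}}{dt} \cdot d\vec{s} = m\int \frac{d\vec{s}}{dt} \cdot d\vec{v} = m\int \vec{v} \cdot d\vec{v} = \frac{1}{2}m\int d(\vec{v} \cdot \vec{v}) = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \tag{2.188}$$

Therefore,

$$\frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 = -\left[ V(\vec{x}_2) - V(\vec{x}_1) \right] \tag{2.189}$$

$$\frac{1}{2}mv_2^2 + V(\vec{x}_2) = \frac{1}{2}mv_1^2 + V(\vec{x}_1) \tag{2.190}$$

The total energy of the particle is then defined to be

$$E = \frac{1}{2}mv^2 + V(\vec{x}) \tag{2.191}$$

and the previous argument shows that $E_2 = E_1$ (conservation of energy).

Since $\vec{p} = m\vec{v}$, we have

$$E = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) = H(\vec{x},\vec{p}) = \text{Hamiltonian} \tag{2.192}$$

Therefore,

$$H_{op} = \frac{\vec{p}_{op} \cdot \vec{p}_{op}}{2m} + V(\vec{x}) = -\frac{\hbar^2}{2m}\nabla^2 + V(\vec{x}) \tag{2.193}$$

Clearly, in this case, there is no ambiguity in the ordering of factors.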

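2. Angular momentum of a particle

We have

$$\vec{L} = \vec{r} \times \vec{p} \;\Rightarrow\; \begin{cases} L_x = yp_z - zp_y \\ L_y = zp_x - xp_z \\ L_z = xp_y - yp_x \end{cases} \tag{2.194}$$

Therefore,

$$L_{x,op} = \frac{\hbar}{i}\left( y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y} \right) \tag{2.195}$$

$$L_{y,op} = \frac{\hbar}{i}\left( z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z} \right) \tag{2.196}$$

$$L_{z,op} = \frac{\hbar}{i}\left( x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \right) \tag{2.197}$$

Again, there is no ambiguity in the ordering of factors because $[x_i, p_{j,op}] =
i\hbar\delta_{ij} = 0$ for $i \neq j$.

Note: The uncertainty principle

$$(\Delta A)(\Delta B) \geq \tfrac{1}{2}\left| \langle [A_{op}, B_{op}] \rangle \right| \tag{2.198}$$

holds for any hermitian operators $A_{op}$ and $B_{op}$. We may therefore apply
it to any physical quantities $A(\vec{x},\vec{p})$ and $B(\vec{x},\vec{p})$ ($\Delta A$ and $\Delta B$ represent
rms deviations in the measurements of A and B).

The evaluation of commutators $[A_{op}, B_{op}]$ will be very important throughout
our discussions.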


Some general rules for commutators

1. $[A, A] = 0$

2. $[A, B] = -[B, A]$

3. $[A, B + C] = [A, B] + [A, C]$

4. $[A + B, C] = [A, C] + [B, C]$

5. $[A, BC] = [A, B]C + B[A, C]$

6. $[AB, C] = [A, C]B + A[B, C]$

7. $[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 \Rightarrow$ Jacobi Identity
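
These rules are purely algebraic, so they can be spot-checked with any non-commuting objects. The sketch below is an illustration that assumes random complex matrices as stand-ins for the operators; the helper name `comm` is hypothetical.

```python
import numpy as np


def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A


rng = np.random.default_rng(42)
A, B, C = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)) for _ in range(3))

checks = {
    "rule 2 (antisymmetry)": np.allclose(comm(A, B), -comm(B, A)),
    "rule 3 (linearity)": np.allclose(comm(A, B + C), comm(A, B) + comm(A, C)),
    "rule 5 ([A, BC])": np.allclose(comm(A, B @ C), comm(A, B) @ C + B @ comm(A, C)),
    "rule 6 ([AB, C])": np.allclose(comm(A @ B, C), comm(A, C) @ B + A @ comm(B, C)),
    "rule 7 (Jacobi)": np.allclose(
        comm(A, comm(B, C)) + comm(B, comm(C, A)) + comm(C, comm(A, B)), 0
    ),
}
for name, ok in checks.items():
    print(name, ok)
```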

2.1.9. Important Question 

Can we find a wave function $\psi(\vec{x},t)$ for a particle such that a measurement of
the quantity $A(\vec{x},\vec{p})$ performed at time $t_0$ on identically-prepared particles (each
particle is described by the same wave function) will always yield the value "a"?

Let $\psi(\vec{x},t_0) = \psi_0(\vec{x})$. A measurement of A at time $t_0$ will yield the value "a"
with certainty if $\Delta A = 0 \Rightarrow \langle A^2 \rangle = \langle A \rangle^2$ at $t_0$. Now

$$\langle A^2 \rangle = \frac{\langle \psi_0 \mid A^2\psi_0 \rangle}{\langle \psi_0 \mid \psi_0 \rangle} = \frac{\langle A\psi_0 \mid A\psi_0 \rangle}{\langle \psi_0 \mid \psi_0 \rangle} \quad \text{(hermiticity)}$$

$$= \frac{\langle A\psi_0 \mid A\psi_0 \rangle \langle \psi_0 \mid \psi_0 \rangle}{\langle \psi_0 \mid \psi_0 \rangle^2} \geq \frac{|\langle \psi_0 \mid A\psi_0 \rangle|^2}{\langle \psi_0 \mid \psi_0 \rangle^2} \quad \text{(Schwarz inequality)}$$

Therefore,

$$\langle A^2 \rangle \geq \frac{|\langle \psi_0 \mid A\psi_0 \rangle|^2}{\langle \psi_0 \mid \psi_0 \rangle^2} = \langle A \rangle^2 \tag{2.199}$$

with $\langle \psi_0 \mid A\psi_0 \rangle$ real. The equality occurs when $A_{op}\psi_0(\vec{x}) = \lambda\psi_0(\vec{x})$ a.e. with
$\lambda$ = a constant. Therefore, $(\Delta A)^2 = \langle A^2 \rangle - \langle A \rangle^2 = 0$ if $A_{op}\psi_0(\vec{x}) = \lambda\psi_0(\vec{x})$ where

$$a = \langle A \rangle = \frac{\langle \psi_0 \mid A\psi_0 \rangle}{\langle \psi_0 \mid \psi_0 \rangle} = \lambda\frac{\langle \psi_0 \mid \psi_0 \rangle}{\langle \psi_0 \mid \psi_0 \rangle} = \lambda \tag{2.200}$$

Important Result

A measurement of $A(\vec{x},\vec{p})$ performed at time $t_0$ yields the value "a" with
certainty ($\langle A \rangle = a$ and $\Delta A = 0$) if $\psi(\vec{x},t_0) = \psi_0(\vec{x})$ is a normalizable wave function
such that $A_{op}\psi_0(\vec{x}) = a\psi_0(\vec{x})$ for all $\vec{x}$ (a.e.).

This is an eigenvalue equation for the operator $A_{op}$. The constant a is an
eigenvalue and $\psi_0(\vec{x})$ is an eigenfunction belonging to the eigenvalue a.

Comments

1. The eigenvalue equation possesses solutions $\psi_0(\vec{x})$ that are normalizable
(square-integrable and not identically zero) only for certain values of a.

2. $\psi_0(\vec{x}) \equiv 0$ satisfies the eigenvalue equation for any a; however, this $\psi_0(\vec{x})$
is not normalizable.

3. $A_{op}\psi_0(\vec{x}) = a\psi_0(\vec{x}) \Rightarrow NA_{op}\psi_0(\vec{x}) = Na\psi_0(\vec{x})$, N = constant. Therefore,
by linearity,

$$A_{op}(N\psi_0(\vec{x})) = a(N\psi_0(\vec{x})) \tag{2.201}$$

Therefore, any non-zero constant multiple of an eigenfunction belonging
to eigenvalue a is also an eigenfunction belonging to eigenvalue a.

4. We have $A_{op}^2\psi_0(\vec{x}) = A_{op}A_{op}\psi_0(\vec{x}) = aA_{op}\psi_0(\vec{x}) = a^2\psi_0(\vec{x})$ explicitly. In
general, we have $A_{op}^N\psi_0(\vec{x}) = a^N\psi_0(\vec{x})$.

5. The wave function for a particle will develop in time according to the forces
present (we will discuss the exact time dependence later). In general, if
$\psi(\vec{x},t)$ is an eigenfunction of $A_{op}$ with eigenvalue a at time $t_0$, it will not
be an eigenfunction of $A_{op}$ at some other time $t_1$.

Theorem

The eigenvalues of a hermitian operator are real.
Proof: We have

$$A_{op}\psi_0 = a\psi_0$$

$$\Rightarrow \langle \psi_0 \mid A\psi_0 \rangle = \langle \psi_0 \mid a\psi_0 \rangle = a\langle \psi_0 \mid \psi_0 \rangle = \langle A\psi_0 \mid \psi_0 \rangle = \langle a\psi_0 \mid \psi_0 \rangle = a^*\langle \psi_0 \mid \psi_0 \rangle$$

so that $a = a^*$ or a is real.
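
Both statements, zero spread in an eigenstate and real eigenvalues, can be illustrated in a finite-dimensional model. The sketch below assumes a random hermitian matrix as a stand-in for $A_{op}$ and vectors for wave functions; `mean_and_spread` is a hypothetical helper, and the matrix setting is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
A = (M + M.conj().T) / 2                 # hermitian stand-in for A_op


def mean_and_spread(A, psi):
    """Return (<A>, Delta A) for a state vector psi (normalized internally)."""
    psi = psi / np.linalg.norm(psi)
    mean = np.vdot(psi, A @ psi).real
    mean_sq = np.vdot(psi, A @ (A @ psi)).real
    return mean, np.sqrt(max(mean_sq - mean ** 2, 0.0))


# Eigenvalues from a general (non-hermitian-aware) solver come out real.
general_eigenvalues = np.linalg.eigvals(A)

# Eigenvectors from the hermitian solver; columns are eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(A)
mean_eig, spread_eig = mean_and_spread(A, eigenvectors[:, 0])

# An equal superposition of two different eigenvectors is not an eigenvector.
superposition = eigenvectors[:, 0] + eigenvectors[:, 1]
mean_sup, spread_sup = mean_and_spread(A, superposition)

print(spread_eig, spread_sup)
```

The spread in the eigenstate vanishes to rounding error and the mean equals the eigenvalue, while the superposition has a spread of half the eigenvalue gap.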

2.1.10. Time-Independent Schrodinger Equation 

Assume $A_{op}$ = Hamiltonian operator is given by

$$H_{op} = \frac{\vec{p}_{op} \cdot \vec{p}_{op}}{2m} + V(\vec{x}) = -\frac{\hbar^2}{2m}\nabla^2 + V(\vec{x}) \tag{2.202}$$

for a particle in a conservative force field. The eigenvalue equation for the energy
is

$$H_{op}\psi_0(\vec{x}) = E\psi_0(\vec{x}) \tag{2.203}$$

with E the eigenvalue. This corresponds to the partial differential equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi_0(\vec{x}) + V(\vec{x})\psi_0(\vec{x}) = E\psi_0(\vec{x}) \tag{2.204}$$

which is the time-independent Schrodinger equation.

A normalizable wave function satisfying this equation at time $t_0$ ($\psi(\vec{x},t_0) =
\psi_0(\vec{x})$) describes a particle whose energy at time $t_0$ is precisely the eigenvalue
E ($\Delta E = 0$, so there is no spread in the possible energy values). This equation
can be solved once the potential energy $V(\vec{x})$ is specified, that is, once the force
$\vec{F} = -\nabla V$ acting on the particle is specified.

We postpone solving this equation for various possible forces at the moment
and, instead, consider the eigenvalue equations for linear momentum, position,
and angular momentum.
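
Although analytic solutions are postponed, a numerical preview of (2.204) may help fix ideas. The sketch below is an illustration under several assumptions that are not in the text: one spatial dimension, the harmonic potential $V(x) = \frac{1}{2}m\omega^2x^2$ of the earlier application, units with $\hbar = m = \omega = 1$, and a central-difference approximation to the second derivative with the wave function set to zero outside the grid.

```python
import numpy as np

hbar = m = omega = 1.0                 # assumed units
n = 801
x = np.linspace(-10.0, 10.0, n)
h = x[1] - x[0]

# Central-difference second derivative; psi is taken to vanish outside the grid.
off_diagonal = np.full(n - 1, 1.0)
laplacian = (np.diag(off_diagonal, -1) - 2.0 * np.eye(n) + np.diag(off_diagonal, 1)) / h ** 2

potential = 0.5 * m * omega ** 2 * x ** 2
H = -(hbar ** 2 / (2 * m)) * laplacian + np.diag(potential)

# H is real symmetric, so the hermitian eigenvalue solver applies.
energies = np.linalg.eigvalsh(H)[:3]
print(energies)
```

In these units the lowest eigenvalue comes out close to 1/2, which is consistent with the zero-point bound $\langle E \rangle \geq \hbar\omega/2$ obtained from the uncertainty principle.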


2.1.11. Some Operator Eigenvalue/Eigenfunction Equations 

Linear Momentum 

Consider

$$p_x \rightarrow p_{x,op} = \frac{\hbar}{i}\frac{\partial}{\partial x} \tag{2.205}$$

The eigenvalue equation is

$$\frac{\hbar}{i}\frac{\partial \psi_0(\vec{x})}{\partial x} = p_x\psi_0(\vec{x}) \tag{2.206}$$

where $p_x$ = eigenvalue. Therefore,

$$\frac{\partial \psi_0(\vec{x})}{\partial x} = \frac{ip_x}{\hbar}\psi_0(\vec{x}) \tag{2.207}$$

with $p_x$ a constant to be determined. The solution is obviously

$$\psi_0(\vec{x}) = N(y,z)\,e^{\frac{i}{\hbar}p_x x} \tag{2.208}$$

This is the solution for any complex $p_x$. However, we must find only those values
of $p_x$ for which $\psi_0(\vec{x})$ is normalizable (this condition should, at least, restrict
$p_x$ to real values according to our earlier results).

Let $p_x = \alpha + i\beta$ , $\alpha$, $\beta$ real. Then

$$\langle \psi_0 \mid \psi_0 \rangle = \int d^3x\, |N(y,z)|^2 \left| e^{\frac{i}{\hbar}(\alpha + i\beta)x} \right|^2 = \int d^3x\, |N(y,z)|^2 e^{-\frac{2\beta x}{\hbar}} \tag{2.209}$$

Therefore,

$$\langle \psi_0 \mid \psi_0 \rangle = \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\, |N(y,z)|^2 \int_{-\infty}^{\infty} dx\, e^{-\frac{2\beta x}{\hbar}} = \infty \tag{2.210}$$

that is, there exists no real $\beta$ for which this integral is finite. Thus, there are
no physical (normalizable) solutions (in $L^2$) to the eigenvalue equations for $p_x$
(and similarly for $p_y$ and $p_z$). A measurement of $p_i$ on a physical system will
never yield a specified value with certainty. Measurements of linear momentum
performed on identically-prepared particles will always yield a spread in results.
Recall that $\Delta x \Delta p_x \geq \hbar/2$. If $\Delta p_x = 0$, then $\Delta x = \infty$, which is a rather unphysical
situation. However, we can make $\Delta p_x \neq 0$ as small as we want, but we cannot
make $\Delta p_x$ exactly zero in any physical system.

Position 

Consider $x_3 = z$ , ($\vec{x} = (x,y,z)$). Here $x_3 = z$ is the operator which multiplies
functions by z. The eigenvalue equation is

$$z\psi_0(\vec{x}) = Z\psi_0(\vec{x}) \tag{2.211}$$

where Z = eigenvalue (constant). Therefore,

$$(z - Z)\psi_0(\vec{x}) = 0 \;\Rightarrow\; \psi_0(\vec{x}) = 0 \tag{2.212}$$

for all $z \neq Z$. We claim that $\psi_0(\vec{x}) = N(x,y)\delta(z - Z)$. To prove this we must
show that $(z - Z)\delta(z - Z) = 0$.

Let f(z) be a function continuous at z = Z. Then

$$\int_{-\infty}^{\infty} dz\, f(z)(z - Z)\delta(z - Z) = \left[ f(z)(z - Z) \right]_{z=Z} = 0 \tag{2.213}$$

for all f(z) continuous at z = Z. This says that $(z - Z)\delta(z - Z) = 0$.
Is $\psi_0(\vec{x}) = N(x,y)\delta(z - Z)$ normalizable?

$$\langle \psi_0 \mid \psi_0 \rangle = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, |N(x,y)|^2 \int_{-\infty}^{\infty} dz\, \delta(z - Z)\delta(z - Z)$$

$$= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, |N(x,y)|^2 \left[ \delta(z - Z) \right]_{z=Z} = \infty$$

since $\delta(0) = \infty$.

Thus, there are no physical (normalizable) solutions (in $L^2$) to the eigenvalue
equation for z (and similarly for x and y). This says that measurements of the
position of identically-prepared particles will always yield a spread in results.


Angular Momentum 

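Consider

$$L_z \rightarrow L_{z,op} = xp_{y,op} - yp_{x,op} = \frac{\hbar}{i}\left( x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \right) \tag{2.214}$$

The eigenvalue equation is

$$\frac{\hbar}{i}\left( x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \right)\psi_0(\vec{x}) = a\psi_0(\vec{x}) \tag{2.215}$$

where a = the eigenvalue.

Consider now a switch to spherical-polar coordinates as shown in Figure 2.9
below.

Figure 2.9: Spherical Polar Coordinates

We express $\psi_0(\vec{x}) = \psi_0(r,\theta,\phi)$ and use the chain rule to find
$\partial\psi_0/\partial x$ and $\partial\psi_0/\partial y$:

$$x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} = x\left( \frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial y}\frac{\partial}{\partial \theta} + \frac{\partial \phi}{\partial y}\frac{\partial}{\partial \phi} \right) - y\left( \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial x}\frac{\partial}{\partial \theta} + \frac{\partial \phi}{\partial x}\frac{\partial}{\partial \phi} \right) \tag{2.216}$$

Now

$$r = \sqrt{x^2 + y^2 + z^2}$$

so that

$$\frac{\partial r}{\partial y} = \frac{y}{r} \quad , \quad \frac{\partial r}{\partial x} = \frac{x}{r}$$

and

$$\cos\theta = \frac{z}{r} = \frac{z}{\sqrt{x^2 + y^2 + z^2}}$$

so that

$$-\sin\theta\frac{\partial \theta}{\partial y} = -\frac{z}{r^2}\frac{\partial r}{\partial y} = -\frac{zy}{r^3} \;\Rightarrow\; \frac{\partial \theta}{\partial y} = \frac{zy}{r^3\sin\theta}$$

$$-\sin\theta\frac{\partial \theta}{\partial x} = -\frac{z}{r^2}\frac{\partial r}{\partial x} = -\frac{zx}{r^3} \;\Rightarrow\; \frac{\partial \theta}{\partial x} = \frac{zx}{r^3\sin\theta}$$

and

$$\tan\phi = \frac{y}{x}$$

so that

$$\sec^2\phi\,\frac{\partial \phi}{\partial x} = -\frac{y}{x^2} \;\Rightarrow\; \frac{\partial \phi}{\partial x} = -\frac{y}{x^2}\cos^2\phi$$

$$\sec^2\phi\,\frac{\partial \phi}{\partial y} = \frac{1}{x} \;\Rightarrow\; \frac{\partial \phi}{\partial y} = \frac{1}{x}\cos^2\phi$$

Thus,

$$x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} = \left( \frac{xy}{r}\frac{\partial}{\partial r} + \frac{xyz}{r^3\sin\theta}\frac{\partial}{\partial \theta} + \cos^2\phi\,\frac{\partial}{\partial \phi} \right) - \left( \frac{xy}{r}\frac{\partial}{\partial r} + \frac{xyz}{r^3\sin\theta}\frac{\partial}{\partial \theta} - \frac{y^2}{x^2}\cos^2\phi\,\frac{\partial}{\partial \phi} \right)$$

$$= \cos^2\phi\,\frac{\partial}{\partial \phi} + \frac{y^2}{x^2}\cos^2\phi\,\frac{\partial}{\partial \phi} = \cos^2\phi\,\frac{\partial}{\partial \phi} + \tan^2\phi\cos^2\phi\,\frac{\partial}{\partial \phi} = \left( \cos^2\phi + \sin^2\phi \right)\frac{\partial}{\partial \phi} = \frac{\partial}{\partial \phi} \tag{2.217}$$

Therefore,

$$L_{z,op} = \frac{\hbar}{i}\frac{\partial}{\partial \phi} \tag{2.218}$$

and the eigenvalue equation is

$$\frac{\hbar}{i}\frac{\partial \psi_0(\vec{x})}{\partial \phi} = a\psi_0(\vec{x}) \;\Rightarrow\; \frac{\partial \psi_0(\vec{x})}{\partial \phi} = \frac{ia}{\hbar}\psi_0(\vec{x}) \tag{2.219}$$

The solution is

$$\psi_0(\vec{x}) = f(r,\theta)\,e^{\frac{i a \phi}{\hbar}} \tag{2.220}$$

for any complex value of the eigenvalue a.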

Important observation

$\phi = 0$ and $\phi = 2\pi$ for the same r and $\theta$ correspond to the same point in space. If
$\psi_0(r,\theta,\phi = 0) \neq \psi_0(r,\theta,\phi = 2\pi)$ then $\psi_0(\vec{x})$ will be discontinuous at this point
and

$$L_{z,op} = \frac{\hbar}{i}\left( x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \right) \tag{2.221}$$

would not be defined on $\psi_0(\vec{x})$. We must, therefore, require that

$$\psi_0(r,\theta,\phi = 0) = \psi_0(r,\theta,\phi = 2\pi) \tag{2.222}$$

This is often called the single-valuedness assumption.

This means that

$$f(r,\theta)\,e^{0} = f(r,\theta)\,e^{\frac{i 2\pi a}{\hbar}} \;\rightarrow\; e^{\frac{i 2\pi a}{\hbar}} = 1$$

$$\Rightarrow \frac{2\pi a}{\hbar} = 2\pi\ell \quad , \quad \ell = 0, \pm 1, \pm 2, \ldots$$

Therefore, $a = \ell\hbar$, that is, the eigenvalues of $L_{z,op}$ can take on only certain
discrete values!

$$L_{z,op}\psi_0(\vec{x}) = \ell\hbar\,\psi_0(\vec{x}) \quad , \quad \psi_0(\vec{x}) = f(r,\theta)\,e^{i\ell\phi} \quad , \quad \ell = 0, \pm 1, \pm 2, \ldots \tag{2.223}$$

Is $\psi_0(\vec{x})$ normalizable? We have

$$\langle \psi_0 \mid \psi_0 \rangle = \int_0^{\infty} r^2 dr \int_0^{\pi} \sin\theta\, d\theta\, |f(r,\theta)|^2 \int_0^{2\pi} d\phi\, \left| e^{i\ell\phi} \right|^2 = 2\pi \int_0^{\infty} r^2 dr \int_0^{\pi} \sin\theta\, d\theta\, |f(r,\theta)|^2$$

Clearly, this can be made to be finite and non-zero with a suitable $f(r,\theta)$. Thus,
there are physical (normalizable) wave functions such that a measurement of $L_z$
will yield the value "a" with certainty. Furthermore, the measured value of $L_z$ in
such a system must be one of the values $0, \pm\hbar, \pm 2\hbar, \ldots$. $L_z$ is said to be
quantized. Classically, we expect $L_z$ to be capable of taking on any definite value.
However, we now see that $L_z$ can only have the precise values $0, \pm\hbar, \pm 2\hbar, \ldots$
when this particle has a definite angular momentum ($\Delta L_z = 0$).
Macroscopically, this discreteness cannot be observed because $\hbar \approx 10^{-27}$ erg-sec is so small
that $L_z$ appears continuous with macroscopic measurements.
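
The quantization argument can be illustrated numerically on the $\phi$ dependence alone. The sketch below makes several assumptions: units with $\hbar = 1$, a periodic grid in $\phi$, and an FFT-based derivative, which is only meaningful for functions that are periodic in $\phi$. The helper names `Lz` and `boundary_mismatch` are hypothetical.

```python
import numpy as np

hbar = 1.0                                   # assumed units
N = 64
phi = 2 * np.pi * np.arange(N) / N           # periodic grid on [0, 2*pi)
k = np.fft.fftfreq(N, d=1.0 / N)             # integer wavenumbers


def Lz(psi):
    """Apply (hbar / i) d/dphi to samples of a 2*pi-periodic function."""
    return -1j * hbar * np.fft.ifft(1j * k * np.fft.fft(psi))


def boundary_mismatch(a):
    """|psi(phi = 2*pi) - psi(phi = 0)| for psi = exp(i a phi / hbar)."""
    return abs(np.exp(2j * np.pi * a / hbar) - 1.0)


ell = 3
psi = np.exp(1j * ell * phi)
is_eigenfunction = np.allclose(Lz(psi), ell * hbar * psi)

print(is_eigenfunction)                      # e^{i ell phi} has eigenvalue ell * hbar
print(boundary_mismatch(2.0 * hbar))         # integer a / hbar: single-valued
print(boundary_mismatch(0.5 * hbar))         # non-integer a / hbar: mismatch at phi = 2*pi
```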







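Notes

1. Let $\psi(\vec{x},t_0)$ be an eigenfunction of $L_z$ with eigenvalue $\ell\hbar$. What is the
probability of finding the particle in a region of space at time $t_0$?

$$\wp(\vec{x},t_0) = \frac{|\psi(\vec{x},t_0)|^2}{\langle \psi_0 \mid \psi_0 \rangle} = \frac{\left| f(r,\theta)\,e^{i\ell\phi} \right|^2}{\langle \psi_0 \mid \psi_0 \rangle} = \frac{|f(r,\theta)|^2}{\langle \psi_0 \mid \psi_0 \rangle} \tag{2.224}$$

which is independent of $\phi$. Thus, the particle is equally likely to be found
at any $\phi$.

2. Let $\psi(\vec{x},t_0)$ be an eigenfunction of $L_z$ with eigenvalue $\ell\hbar$ ($\langle L_z \rangle = \ell\hbar$ , $\Delta L_z =
0$). What are $\langle L_x \rangle$ and $\langle L_y \rangle$? Since

$$L_yL_z - L_zL_y = (zp_x - xp_z)(xp_y - yp_x) - (xp_y - yp_x)(zp_x - xp_z)$$

$$= zp_yp_xx - zp_xyp_x - xp_zxp_y + p_zyxp_x - p_yzxp_x + xp_yxp_z + yp_xzp_x - yp_zp_xx$$

$$= (zp_y - yp_z)p_xx + (p_zy - p_yz)xp_x = L_x[x, p_x] = i\hbar L_x$$

we have

$$i\hbar\langle \psi_0 \mid L_x\psi_0 \rangle = \langle \psi_0 \mid L_yL_z\psi_0 \rangle - \langle \psi_0 \mid L_zL_y\psi_0 \rangle$$

$$= \langle \psi_0 \mid L_y\ell\hbar\psi_0 \rangle - \langle L_z\psi_0 \mid L_y\psi_0 \rangle$$

$$= \ell\hbar\langle \psi_0 \mid L_y\psi_0 \rangle - \langle \ell\hbar\psi_0 \mid L_y\psi_0 \rangle$$

$$= \ell\hbar\left( \langle \psi_0 \mid L_y\psi_0 \rangle - \langle \psi_0 \mid L_y\psi_0 \rangle \right) = 0$$

so that $\langle \psi_0 \mid L_x\psi_0 \rangle = \langle L_x \rangle = 0$ and similarly $\langle L_y \rangle = 0$.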
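
The commutator $[L_y, L_z] = i\hbar L_x$ and the vanishing of $\langle L_x \rangle$ and $\langle L_y \rangle$ in an $L_z$ eigenstate can be illustrated with matrices. The sketch below assumes the standard $3 \times 3$ matrices for angular momentum with $\ell = 1$ in units of $\hbar = 1$; these matrices are not introduced in the text and are used here only as a finite-dimensional illustration.

```python
import numpy as np

hbar = 1.0   # assumed units

# Standard l = 1 matrices in the basis of L_z eigenvectors (m = +1, 0, -1).
Lz = hbar * np.diag([1.0, 0.0, -1.0]).astype(complex)
L_plus = hbar * np.sqrt(2.0) * np.array([[0, 1, 0],
                                         [0, 0, 1],
                                         [0, 0, 0]], dtype=complex)
L_minus = L_plus.conj().T
Lx = (L_plus + L_minus) / 2
Ly = (L_plus - L_minus) / (2 * 1j)

commutator_ok = np.allclose(Ly @ Lz - Lz @ Ly, 1j * hbar * Lx)

# Expectation values of L_x and L_y in each L_z eigenstate (the basis vectors).
basis = np.eye(3, dtype=complex)
expectations = [
    (np.vdot(basis[:, j], Lx @ basis[:, j]).real,
     np.vdot(basis[:, j], Ly @ basis[:, j]).real)
    for j in range(3)
]

print(commutator_ok)
print(expectations)
```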


Our basic result was the following:

A measurement of the physical quantity $A(\vec{x},\vec{p})$ performed
at the time $t_0$ yields the value "a" with certainty if, and only
if, $\psi(\vec{x},t_0) = \psi_0(\vec{x})$ is a normalizable wave function such that

$$A_{op}\psi_0(\vec{x}) = a\psi_0(\vec{x})$$

Suppose that $\psi(\vec{x},t_0)$ is not an eigenfunction of $A_{op}$. Then, a measurement of
$A(\vec{x},\vec{p})$ at time $t_0$ will necessarily have $\Delta A \neq 0$, that is, there will be a non-zero
spread in the possible results of the measurement. Two important questions
must then be answered:

1. If a measurement of $A(\vec{x},\vec{p})$ is performed at $t_0$ on such a particle, what
are the possible results of such a measurement?

2. What is the probability of obtaining each of these possible results?

To answer these questions, it is necessary to develop some more operator
concepts. The functions used in the following discussion will be square-integrable
(they $\in L^2$).

Consider the linear operator $A = A(\vec{x},\vec{p})$. It need not be hermitian for this
discussion.

A = sum of terms of the form

$$x^n y^m z^{\ell}\, \frac{\partial^N}{\partial x^N}\frac{\partial^M}{\partial y^M}\frac{\partial^L}{\partial z^L} \tag{2.225}$$

with other possible orderings of the non-commuting operators

$$x_i \quad \text{and} \quad \frac{\partial}{\partial x_i} \tag{2.226}$$

also present, for example

$$x^{n-1}\,\frac{\partial}{\partial x}\, x\, \frac{\partial^{N-1}}{\partial x^{N-1}} \tag{2.227}$$

A usually cannot be defined on all square-integrable functions. For example,
$\partial/\partial x_j$ cannot be defined on discontinuous functions.

In the following, $A\phi$ signifies that A can be defined on this $\phi$. For a typical
operator term

$$A = x^n y^m z^{\ell}\, \frac{\partial^N}{\partial x^N}\frac{\partial^M}{\partial y^M}\frac{\partial^L}{\partial z^L} \tag{2.228}$$

we can write

$$\langle \phi_1 \mid A\phi_2 \rangle = \int d^3x\, \phi_1^*\, x^n y^m z^{\ell}\, \frac{\partial^N}{\partial x^N}\frac{\partial^M}{\partial y^M}\frac{\partial^L}{\partial z^L}\phi_2$$

$$= \int d^3x \left[ x^n y^m z^{\ell} \phi_1^* \right] \frac{\partial^N}{\partial x^N}\frac{\partial^M}{\partial y^M}\frac{\partial^L}{\partial z^L}\phi_2$$

$$= (-1)^{N+M+L} \int d^3x \left\{ \frac{\partial^N}{\partial x^N}\frac{\partial^M}{\partial y^M}\frac{\partial^L}{\partial z^L}\left[ x^n y^m z^{\ell} \phi_1^* \right] \right\} \phi_2 = \langle A^{\dagger}\phi_1 \mid \phi_2 \rangle \tag{2.229}$$

which defines the operator $A^{\dagger}$.

Note that a typical surface term occurring in one of the integrations by parts is
of the form

$$\int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz \left[ x^n y^m z^{\ell} \phi_1^*\, \frac{\partial^{N-1}}{\partial x^{N-1}}\frac{\partial^M}{\partial y^M}\frac{\partial^L}{\partial z^L}\phi_2 \right]_{x=-\infty}^{x=+\infty} \tag{2.230}$$

Even though the square-integrable $\phi_1 \rightarrow 0$ and $\phi_2 \rightarrow 0$ as $x \rightarrow \infty$, this surface
term need not be zero because $x^n\phi_1$ need not go to zero as $x \rightarrow \infty$. Thus, the
above derivation holds for many but not all $\phi_1$ and $\phi_2$. We shall ignore such
difficulties in the following discussions.

Thus, if A is a linear operator, there exists an operator $A^{\dagger}$ (read A-dagger)
called the adjoint operator of A, such that

$$\langle \phi_1 \mid A\phi_2 \rangle = \langle A^{\dagger}\phi_1 \mid \phi_2 \rangle \tag{2.231}$$
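Notes

1. $\langle A\phi_1 \mid \phi_2 \rangle = \langle \phi_2 \mid A\phi_1 \rangle^* = \langle A^{\dagger}\phi_2 \mid \phi_1 \rangle^* = \langle \phi_1 \mid A^{\dagger}\phi_2 \rangle$

2. The operator $A^{\dagger}$ is unique, that is, there exists only one operator $A^{\dagger}$
satisfying $\langle \phi_1 \mid A\phi_2 \rangle = \langle A^{\dagger}\phi_1 \mid \phi_2 \rangle$.

Proof: Suppose the operator B also satisfies $\langle \phi_1 \mid A\phi_2 \rangle = \langle B\phi_1 \mid \phi_2 \rangle$.
Then

$$\langle A^{\dagger}\phi_1 \mid \phi_2 \rangle = \langle B\phi_1 \mid \phi_2 \rangle$$

$$\langle (B - A^{\dagger})\phi_1 \mid \phi_2 \rangle = 0 \quad \text{for all } \phi_2$$

Let $\phi_2 = B\phi_1 - A^{\dagger}\phi_1$, then

$$\langle (B - A^{\dagger})\phi_1 \mid (B - A^{\dagger})\phi_1 \rangle = 0 \;\rightarrow\; (B - A^{\dagger})\phi_1 = 0$$

$$B\phi_1 = A^{\dagger}\phi_1 \quad \text{for all } \phi_1 \;\Rightarrow\; B = A^{\dagger}$$

3. $A^{\dagger}$ is linear.

4. $(A^{\dagger})^{\dagger} = A$

5. $(A + B)^{\dagger} = A^{\dagger} + B^{\dagger}$

6. $(\lambda A)^{\dagger} = \lambda^* A^{\dagger}$

7. $(AB)^{\dagger} = B^{\dagger}A^{\dagger}$

Definition

A is hermitian if $\langle \phi_1 \mid A\phi_2 \rangle = \langle A\phi_1 \mid \phi_2 \rangle$ for $\phi_1$ and $\phi_2$ on which A is defined.
This means that A is hermitian if $A^{\dagger} = A$.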
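
In a finite-dimensional model the adjoint is the conjugate transpose, which makes the properties above easy to check. The sketch below is an illustration that assumes random complex matrices and vectors as stand-ins for operators and functions, plus a central-difference matrix `D` as a stand-in for $\partial/\partial x$ acting on functions that vanish at the ends of the grid (so that surface terms drop out); `dagger` is a hypothetical helper.

```python
import numpy as np


def dagger(M):
    """Adjoint of a matrix: the conjugate transpose."""
    return M.conj().T


rng = np.random.default_rng(7)
n = 6
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
phi1 = rng.normal(size=n) + 1j * rng.normal(size=n)
phi2 = rng.normal(size=n) + 1j * rng.normal(size=n)
lam = 0.3 - 1.7j

checks = {
    "defining relation (2.231)": np.isclose(np.vdot(phi1, A @ phi2),
                                            np.vdot(dagger(A) @ phi1, phi2)),
    "(A^dagger)^dagger = A": np.allclose(dagger(dagger(A)), A),
    "(A + B)^dagger": np.allclose(dagger(A + B), dagger(A) + dagger(B)),
    "(lambda A)^dagger": np.allclose(dagger(lam * A), np.conj(lam) * dagger(A)),
    "(AB)^dagger": np.allclose(dagger(A @ B), dagger(B) @ dagger(A)),
}

# Derivative stand-in: one integration by parts flips the sign, D^dagger = -D.
h = 0.1
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)
checks["(d/dx)^dagger = -d/dx"] = np.allclose(dagger(D), -D)

for name, ok in checks.items():
    print(name, bool(ok))
```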

Definition

The N functions $\phi_1, \phi_2, \phi_3, \ldots, \phi_N$ are linearly independent if

$$\sum_{i=1}^{N} \lambda_i\phi_i = 0 \tag{2.232}$$

a.e. implies that $\lambda_i = 0$ for $i = 1, 2, \ldots, N$, that is,

$$\sum_{i=1}^{N} \lambda_i\phi_i = 0 \tag{2.233}$$

a.e. can be satisfied only if $\lambda_1 = \lambda_2 = \ldots = \lambda_N = 0$.

Notes

1. The functions $\phi_1, \phi_2, \phi_3, \ldots, \phi_N$ are linearly dependent if one of the
functions $\phi_j(\vec{x}) = 0$ a.e. because

$$\sum_{i=1}^{N} \lambda_i\phi_i = 0 \tag{2.234}$$

can have $\lambda_j\phi_j = 0$ with $\lambda_j$ arbitrary.

2. If $\phi_1, \phi_2, \phi_3, \ldots, \phi_N$ are linearly dependent, then one of these functions can
be expressed as a linear superposition of the others. Proof:

$$\sum_{i=1}^{N} \lambda_i\phi_i = 0 \quad \text{with some } \lambda_j \neq 0$$

$$\Rightarrow \lambda_j\phi_j + \sum_{i \neq j} \lambda_i\phi_i = 0 \;\Rightarrow\; \phi_j = -\frac{1}{\lambda_j}\sum_{i \neq j} \lambda_i\phi_i$$

3. If $\phi_1, \phi_2, \phi_3, \ldots, \phi_N$ are mutually orthogonal ($\langle \phi_i \mid \phi_j \rangle = 0$ for $i \neq j$) and if
no $\phi_i$ is identically zero, then these N functions are linearly independent.

Proof: Let

$$\sum_{i=1}^{N} \lambda_i\phi_i = 0 \tag{2.235}$$

Therefore,

$$0 = \left\langle \phi_j \,\Big|\, \sum_{i=1}^{N} \lambda_i\phi_i \right\rangle = \sum_{i=1}^{N} \lambda_i \langle \phi_j \mid \phi_i \rangle = \lambda_j \langle \phi_j \mid \phi_j \rangle \;\Rightarrow\; \lambda_j = 0$$

since $\phi_j$ is not identically zero.

Definition

$\phi_1, \phi_2, \phi_3, \ldots, \phi_N$ are orthonormal (ON) if they are mutually orthogonal and
normalized: $\langle \phi_i \mid \phi_j \rangle = \delta_{ij}$.

Theorem 

We now define the so-called Gram-Schmidt Orthogonalization Procedure.

Given the linearly independent functions $\phi_1, \phi_2, \ldots$ (finite or infinite number of functions), one can construct an orthogonal set of non-zero (not identically zero) functions $v_1, v_2, \ldots$ where $v_n$ is a linear superposition of $\phi_1, \phi_2, \ldots, \phi_n$ ($v_n$ does not depend on $\phi_{n+1}, \phi_{n+2}, \ldots$).

Furthermore,

$$u_n = \frac{v_n}{\sqrt{\langle v_n \,|\, v_n \rangle}} \tag{2.236}$$

gives an ON set of $u_1, u_2, \ldots$


Construction 

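1. $v_1 = \phi_1$ ($\phi_1$ is not identically zero).

2. Let $v_2 = \phi_2 + a_{21}\phi_1$ where $a_{21}$ is determined by requiring that

$$\langle v_1 \,|\, v_2 \rangle = 0 = \langle v_1 \,|\, \phi_2 + a_{21}\phi_1 \rangle = \langle v_1 \,|\, \phi_2 \rangle + a_{21} \langle v_1 \,|\, \phi_1 \rangle$$

so that

$$a_{21} = -\frac{\langle v_1 \,|\, \phi_2 \rangle}{\langle v_1 \,|\, \phi_1 \rangle} \tag{2.237}$$

($v_2$ is not identically zero because $0 = \phi_2 + a_{21} v_1$ is impossible when $\phi_2$ and $v_1 = \phi_1$ are linearly independent).

3. Let $v_3 = \phi_3 + a_{32} v_2 + a_{31} v_1$ where $a_{32}$ and $a_{31}$ are determined by requiring that

$$\langle v_1 \,|\, v_3 \rangle = 0 = \langle v_1 \,|\, \phi_3 + a_{32} v_2 + a_{31} v_1 \rangle = \langle v_1 \,|\, \phi_3 \rangle + a_{32} \underbrace{\langle v_1 \,|\, v_2 \rangle}_{=0} + a_{31} \underbrace{\langle v_1 \,|\, v_1 \rangle}_{\neq 0}$$

so that

$$0 = \langle v_1 \,|\, \phi_3 \rangle + a_{31} \langle v_1 \,|\, v_1 \rangle \;\Rightarrow\; a_{31} = -\frac{\langle v_1 \,|\, \phi_3 \rangle}{\langle v_1 \,|\, v_1 \rangle} \tag{2.238}$$

and

$$\langle v_2 \,|\, v_3 \rangle = 0 = \langle v_2 \,|\, \phi_3 + a_{32} v_2 + a_{31} v_1 \rangle = \langle v_2 \,|\, \phi_3 \rangle + a_{32} \underbrace{\langle v_2 \,|\, v_2 \rangle}_{\neq 0} + a_{31} \underbrace{\langle v_2 \,|\, v_1 \rangle}_{=0}$$

so that

$$0 = \langle v_2 \,|\, \phi_3 \rangle + a_{32} \langle v_2 \,|\, v_2 \rangle \;\Rightarrow\; a_{32} = -\frac{\langle v_2 \,|\, \phi_3 \rangle}{\langle v_2 \,|\, v_2 \rangle} \tag{2.239}$$

4. The construction of $v_4, v_5, \ldots$ proceeds in a similar manner.

Note

$$u_n = \frac{v_n}{\sqrt{\langle v_n \,|\, v_n \rangle}}$$

implies that $u_n$ is normalized:

$$\langle u_n \,|\, u_n \rangle = \frac{\langle v_n \,|\, v_n \rangle}{\langle v_n \,|\, v_n \rangle} = 1 \tag{2.240}$$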

Example

Consider functions of one variable $\phi_n(x) = x^n$, $n = 0, 1, 2, \ldots$ defined on $[-1, 1]$ so that

$$\langle \phi_1 \,|\, \phi_2 \rangle = \int_{-1}^{1} dx\, \phi_1^*(x)\, \phi_2(x) \tag{2.241}$$

These functions are linearly independent by the properties of a power series, that is,

$$\sum_n a_n \phi_n(x) = \sum_n a_n x^n = 0 \tag{2.242}$$

implies that all $a_n$ are zero. If we designate the orthogonal polynomials that result from the Gram-Schmidt process, applied to $\phi_n(x) = x^n$, $n = 0, 1, 2, \ldots$, by $P_0(x), P_1(x), P_2(x), \ldots$ and require that $P_n(x = +1) = +1$ instead of the standard normalization given by

$$\langle P_n \,|\, P_n \rangle = \int_{-1}^{1} dx\, P_n^*(x)\, P_n(x) = 1 \tag{2.243}$$

then the resulting functions are the standard Legendre polynomials

$$P_0(x) = 1 \;,\quad P_1(x) = x \;,\quad P_2(x) = \tfrac{1}{2}(3x^2 - 1) \;,\quad \text{etc.} \tag{2.244}$$

with

$$\langle P_n \,|\, P_n \rangle = \int_{-1}^{1} dx\, P_n^*(x)\, P_n(x) = \frac{2}{2n + 1} \tag{2.245}$$
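
The Gram-Schmidt construction applied to the monomials can be sketched numerically. The following is a minimal illustration, assuming NumPy's `Polynomial` class for exact polynomial arithmetic; the helper names `inner` and `gram_schmidt` are hypothetical and are not part of the text. Since the monomials are real, the complex conjugate in the inner product is omitted.

```python
import numpy as np
from numpy.polynomial import Polynomial

def inner(f, g):
    """<f|g> = integral of f(x) g(x) over [-1, 1] for real polynomials."""
    antiderivative = (f * g).integ()
    return antiderivative(1.0) - antiderivative(-1.0)

def gram_schmidt(phis):
    """Build mutually orthogonal v_n from the linearly independent phis."""
    vs = []
    for phi in phis:
        v = phi
        for w in vs:
            # subtract the component of phi along each earlier v
            v = v - (inner(w, phi) / inner(w, w)) * w
        vs.append(v)
    return vs

# phi_n(x) = x^n for n = 0, 1, 2, 3
monomials = [Polynomial([0.0] * n + [1.0]) for n in range(4)]
vs = gram_schmidt(monomials)

# Rescale so that P_n(+1) = +1 (the Legendre convention used in the text)
legendre = [v / v(1.0) for v in vs]

for n, p in enumerate(legendre):
    print(n, np.round(p.coef, 6), "<P_n|P_n> =", round(inner(p, p), 6))
```

The printed coefficients are listed from the constant term upward, so the entry for $n = 2$ should correspond to $\tfrac{1}{2}(3x^2 - 1)$, and the norms should follow $2/(2n+1)$.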


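Definition

$\phi_1, \phi_2, \ldots$ is a complete orthonormal (CON) set in $L^2$ if it is an ON set ($\langle \phi_i \,|\, \phi_j \rangle = \delta_{ij}$) and any $\phi \in L^2$ can be expanded in terms of the $\phi_i$:

$$\phi(\vec{x}) = \sum_{i=1}^{\infty} \lambda_i \phi_i(\vec{x}) \tag{2.246}$$

with convergence a.e.

The $\lambda_i$'s can be found explicitly:

$$\langle \phi_j \,|\, \phi \rangle = \Big\langle \phi_j \,\Big|\, \sum_{i=1}^{\infty} \lambda_i \phi_i \Big\rangle = \sum_{i=1}^{\infty} \lambda_i \langle \phi_j \,|\, \phi_i \rangle = \sum_{i=1}^{\infty} \lambda_i \delta_{ji} = \lambda_j \;,\quad j = 1, 2, 3, \ldots$$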
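
Theorem

Let $\phi_\lambda \in L^2$ and $\phi_\mu \in L^2$ so that we can write

$$\phi_\lambda = \sum_{i=1}^{\infty} \lambda_i \phi_i \;,\quad \phi_\mu = \sum_{i=1}^{\infty} \mu_i \phi_i$$

with $\{\phi_i\}$ a CON set. Then,

$$\langle \phi_\lambda \,|\, \phi_\mu \rangle = \sum_i \lambda_i^* \mu_i \tag{2.247}$$

Proof:

$$\langle \phi_\lambda \,|\, \phi_\mu \rangle = \Big\langle \sum_{i=1}^{\infty} \lambda_i \phi_i \,\Big|\, \sum_{j=1}^{\infty} \mu_j \phi_j \Big\rangle = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \lambda_i^* \mu_j \langle \phi_i \,|\, \phi_j \rangle = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \lambda_i^* \mu_j \delta_{ij} = \sum_i \lambda_i^* \mu_i \tag{2.248}$$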


Note 

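If $\phi \in L^2$ with $\phi = \sum_{i=1}^{\infty} \lambda_i \phi_i$ where $\{\phi_i\}$ is a CON set, then an immediate consequence of the previous theorem is

$$\langle \phi \,|\, \phi \rangle = \sum_{i=1}^{\infty} \lambda_i^* \lambda_i = \sum_{i=1}^{\infty} |\lambda_i|^2 \tag{2.249}$$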

Now let $A$ be a linear operator (acting on functions in $L^2$). The eigenvalue equation for $A$ in $L^2$ is $A\phi(\vec{x}) = a\phi(\vec{x})$ for all $\vec{x}$, where the eigenfunction $\phi(\vec{x})$ is normalizable (square-integrable and not identically zero) and where the eigenvalue $a$ is a constant. Since $\phi(\vec{x})$ is normalizable, the equation can be satisfied for only certain values of $a$.

Theorem 

The eigenvalues of a hermitian operator are real. 

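Proof: Let $A\phi = a\phi$. Therefore,

$$\langle \phi \,|\, A\phi \rangle = \langle \phi \,|\, a\phi \rangle = a \langle \phi \,|\, \phi \rangle$$

$$= \langle A\phi \,|\, \phi \rangle = \langle a\phi \,|\, \phi \rangle = a^* \langle \phi \,|\, \phi \rangle \;\Rightarrow\; a = a^* \;\Rightarrow\; a \text{ is real}$$

where we have used $A = A^\dagger$ and $\langle \phi \,|\, \phi \rangle \neq 0$.

For a given eigenvalue $a$, there may exist several linearly independent eigenfunctions: $A\phi_1 = a\phi_1$, $A\phi_2 = a\phi_2$, $\ldots$ with $\phi_1, \phi_2, \ldots$ linearly independent but having the same eigenvalue $a$.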

Definition

The eigenvalue $a$ is $g$-fold degenerate if there exist $g$ linearly independent eigenfunctions belonging to the eigenvalue $a$ and any $(g + 1)$ eigenfunctions belonging to $a$ are linearly dependent. The eigenvalue $a$ is non-degenerate if there exists only one linearly independent eigenfunction belonging to $a$.

Some of the eigenvalues of $A$ may be non-degenerate while other eigenvalues of $A$ may have differing orders of degeneracy.

Note that $A\phi = a\phi \Rightarrow A(N\phi) = a(N\phi)$ for any constant $N$. However, $\phi$ and $N\phi$ are linearly dependent and therefore, the eigenfunction $N\phi$ does not add to the degeneracy of the eigenvalue $a$.

Theorem 

Let $\phi_1, \phi_2, \phi_3, \ldots, \phi_n$ be eigenfunctions of $A$ belonging to the same eigenvalue $a$. These eigenfunctions need not be linearly independent. Then, any linear superposition

$$\sum_{i=1}^{n} c_i \phi_i \tag{2.250}$$

which is not identically zero is also an eigenfunction belonging to the eigenvalue $a$.

Proof:

$$A\phi_i = a\phi_i \;,\quad i = 1, 2, 3, \ldots, n$$

$$A\Big(\sum_{i=1}^{n} c_i \phi_i\Big) = \sum_{i=1}^{n} c_i A\phi_i = \sum_{i=1}^{n} c_i (a\phi_i) = a\Big(\sum_{i=1}^{n} c_i \phi_i\Big)$$

where we have used the linearity of $A$.

Theorem 

Let the eigenvalue $a$ be $g$-fold degenerate. Then, there exist $g$ orthonormal eigenfunctions $(u_1, u_2, \ldots, u_g)$ all belonging to the eigenvalue $a$ such that any eigenfunction $\phi$ belonging to the eigenvalue $a$ can be written

$$\phi = \sum_{i=1}^{g} c_i u_i \;,\quad \langle u_i \,|\, u_j \rangle = \delta_{ij} \tag{2.251}$$

that is, $\{u_1, u_2, \ldots, u_g\}$ forms a CON set of eigenfunctions for the eigenvalue $a$.

Proof: $g$-fold degeneracy implies there exist $g$ linearly independent eigenfunctions $\phi_1, \phi_2, \phi_3, \ldots, \phi_g$ all belonging to the eigenvalue $a$. One can perform the Gram-Schmidt procedure on these eigenfunctions to obtain $\{u_1, u_2, \ldots, u_g\}$, an ON set of (non-zero) eigenfunctions belonging to the eigenvalue $a$ (since $u_n$ is a linear superposition of $\phi_1, \phi_2, \phi_3, \ldots, \phi_g$, $u_n$ must be an eigenfunction belonging to the eigenvalue $a$). To show completeness for the eigenfunctions belonging to $a$, let $\phi$ be an arbitrary eigenfunction belonging to $a$. $g$-fold degeneracy implies that the $(g + 1)$ eigenfunctions

$$u_1, u_2, \ldots, u_g, \phi$$

must be linearly dependent. Therefore,

$$\sum_{i=1}^{g} d_i u_i + d\phi = 0 \tag{2.252}$$

holds with coefficients that are not all zero. Moreover, $d$ must be non-zero: if $d$ were zero, then all $d_i = 0$ by the linear independence of $\{u_1, u_2, \ldots, u_g\}$. Therefore,

$$\phi = -\frac{1}{d} \sum_{i=1}^{g} d_i u_i \tag{2.253}$$


Theorem 

Let A be hermitian. The eigenfunctions of A belonging to different eigenvalues 
are orthogonal. 

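Proof: Let $A\phi_1 = a_1\phi_1$, $A\phi_2 = a_2\phi_2$, $a_1 \neq a_2$. Then

$$\langle \phi_2 \,|\, A\phi_1 \rangle = \langle \phi_2 \,|\, a_1\phi_1 \rangle = a_1 \langle \phi_2 \,|\, \phi_1 \rangle$$

$$= \langle A\phi_2 \,|\, \phi_1 \rangle = \langle a_2\phi_2 \,|\, \phi_1 \rangle = a_2^* \langle \phi_2 \,|\, \phi_1 \rangle = a_2 \langle \phi_2 \,|\, \phi_1 \rangle$$

$$(a_1 - a_2) \langle \phi_2 \,|\, \phi_1 \rangle = 0 \;,\quad a_1 - a_2 \neq 0$$

$$\langle \phi_2 \,|\, \phi_1 \rangle = 0$$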
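
As a finite-dimensional illustration of the last two theorems, one can check numerically that a hermitian matrix has real eigenvalues and mutually orthogonal eigenvectors. This is only a sketch: the random 4x4 matrix is a hypothetical stand-in for a hermitian operator, and the general solver `numpy.linalg.eig` is used deliberately so that hermiticity is not built into the routine.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                      # A equals its own adjoint

# General eigenvalue solver: it does not assume that A is hermitian.
eigenvalues, vectors = np.linalg.eig(A)

# Eigenvalues should be real up to rounding error.
max_imag = np.max(np.abs(eigenvalues.imag))

# Matrix of inner products <phi_i|phi_j>; the columns of `vectors` have unit norm.
gram = vectors.conj().T @ vectors
max_overlap = np.max(np.abs(gram - np.eye(4)))

print("largest |Im a|:", max_imag)
print("largest deviation of <phi_i|phi_j> from delta_ij:", max_overlap)
```

Both printed numbers should be at the level of floating-point rounding, since a random matrix of this kind has distinct eigenvalues.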

Let $A$ be a hermitian operator with eigenvalues $a_1, a_2, a_3, \ldots, a_n, \ldots$

Let the eigenvalue $a_n$ be $g_n$-fold degenerate ($g_n = 1$ if $a_n$ is non-degenerate).

Let $\{u_n^{(1)}, u_n^{(2)}, \ldots, u_n^{(g_n)}\}$ be a CON set of eigenfunctions for the eigenvalue $a_n$. Note that the eigenvalues $a_n$ are countable, that is, they do not take on a continuum of values. This property (called separability) will be discussed further later. Now we have

$$A u_n^{(\alpha)} = a_n u_n^{(\alpha)} \;,\quad \alpha = 1, 2, \ldots, g_n \tag{2.254}$$

so $\alpha$ labels each member of the set of degenerate functions which belong to the eigenvalue $a_n$ and $n$ labels the eigenvalue $a_n$.

Since eigenfunctions belonging to different eigenvalues are necessarily orthogonal, we have

$$\langle u_n^{(\alpha)} \,|\, u_m^{(\beta)} \rangle = \delta_{nm} \delta_{\alpha\beta} \tag{2.255}$$

Let $\{u_n^{(\alpha)}\}$ be an ON set of eigenfunctions of $A$ with $n$ and $\alpha$ taking on all possible values, that is, with $a_n$ going through all eigenvalues of $A$, and with $\alpha$ going through a CON set of eigenfunctions for each eigenvalue $a_n$, that is, any eigenfunction belonging to $a_n$ can be expressed as a linear superposition of $\{u_n^{(1)}, u_n^{(2)}, \ldots, u_n^{(g_n)}\}$. We will call such a set a maximal ON set of eigenfunctions of $A$.

$\{u_n^{(\alpha)}\}$ contains the following eigenfunctions:

$u_1^{(1)}, u_1^{(2)}, \ldots, u_1^{(g_1)}$ = CON set for eigenvalue $a_1$ ($g_1$-fold degenerate)

$u_2^{(1)}, u_2^{(2)}, \ldots, u_2^{(g_2)}$ = CON set for eigenvalue $a_2$ ($g_2$-fold degenerate)

(going through all eigenvalues of $A$)

The set $\{u_n^{(\alpha)}\}$ may or may not be a complete orthonormal set for all square-integrable functions, that is, an arbitrary square-integrable function $\phi(\vec{x})$ may or may not be expressible in the form

$$\phi(\vec{x}) = \sum_n \sum_{\alpha=1}^{g_n} c_n^{(\alpha)} u_n^{(\alpha)}(\vec{x}) \tag{2.256}$$

Example 

Consider 

$$A = p_{x\,op} = \frac{\hbar}{i} \frac{\partial}{\partial x} \tag{2.257}$$

As we showed earlier, there are no normalizable eigenfunctions of $p_{x\,op}$. Thus, $\{u_n^{(\alpha)}\}$ contains no functions - it is not a CON set in $L^2$.

Example 

Consider 

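$$A = L_{z\,op} = \frac{\hbar}{i} \frac{\partial}{\partial \varphi} \tag{2.258}$$

As we showed earlier, the eigenfunctions of $L_{z\,op}$ are $f(r, \theta) e^{i\ell\varphi}$ with $\ell = 0, \pm 1, \pm 2, \ldots$ and

$$\int_0^\infty r^2 dr \int_0^\pi \sin\theta\, d\theta \int_0^{2\pi} d\varphi\, \left| f(r, \theta) e^{i\ell\varphi} \right|^2 \tag{2.259}$$

is finite and non-zero. Let $\{f_\alpha(r, \theta)\}_{\alpha = 1, 2, \ldots}$ be any CON set for square-integrable functions of $r$ and $\theta$ so that

$$\int_0^\infty r^2 dr \int_0^\pi \sin\theta\, d\theta\, f_\alpha^*(r, \theta) f_\beta(r, \theta) = \delta_{\alpha\beta} \tag{2.260}$$

and

$$f(r, \theta) = \sum_\alpha d_\alpha f_\alpha(r, \theta) \tag{2.261}$$

for any square-integrable function $f(r, \theta)$. Now

$$u_\ell^{(\alpha)}(\vec{x}) = \frac{1}{\sqrt{2\pi}} f_\alpha(r, \theta) e^{i\ell\varphi} \tag{2.262}$$

is an eigenfunction of $L_{z\,op}$ with eigenvalue $\ell\hbar$ for any $\alpha$.

Since $u_\ell^{(1)}, u_\ell^{(2)}, \ldots$ are linearly independent, the eigenvalues have infinite order (all $r$ and $\theta$ values) of degeneracy. We also have

$$\langle u_\ell^{(\alpha)} \,|\, u_{\ell'}^{(\alpha')} \rangle = \int_0^\infty r^2 dr \int_0^\pi \sin\theta\, d\theta\, f_\alpha^*(r, \theta) f_{\alpha'}(r, \theta) \int_0^{2\pi} d\varphi\, \frac{e^{-i\ell\varphi} e^{i\ell'\varphi}}{2\pi} = \delta_{\alpha\alpha'} \delta_{\ell\ell'} \tag{2.263}$$

Let $\Phi(\vec{x})$ be an arbitrary square-integrable function. Then we have

$$\Phi(r, \theta, \varphi) = \sum_{\ell = -\infty}^{\infty} b_\ell(r, \theta) \frac{e^{i\ell\varphi}}{\sqrt{2\pi}} \tag{2.264}$$

which is just the Fourier series expansion of a function of $\varphi$ ($r$, $\theta$ fixed) in the interval $[0, 2\pi]$. But, we can always write

$$b_\ell(r, \theta) = \sum_\alpha c_\ell^{(\alpha)} f_\alpha(r, \theta) \tag{2.265}$$

since $\{f_\alpha(r, \theta)\}_{\alpha = 1, 2, \ldots}$ is a CON set. Therefore,

$$\Phi(r, \theta, \varphi) = \sum_{\ell = -\infty}^{\infty} \sum_\alpha c_\ell^{(\alpha)} f_\alpha(r, \theta) \frac{e^{i\ell\varphi}}{\sqrt{2\pi}} = \sum_{\ell = -\infty}^{\infty} \sum_\alpha c_\ell^{(\alpha)} u_\ell^{(\alpha)}(\vec{x}) \tag{2.266}$$

that is, $\{u_\ell^{(\alpha)}(\vec{x})\}$ is a CON set in $L^2$.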

Let $A_{op} = A(\vec{x}, \vec{p}_{op})$ be the hermitian operator corresponding to some physical quantity $A = A(\vec{x}, \vec{p})$. Then, as we have stated earlier, a measurement of $A$ performed at time $t_0$ will yield the value $a$ with certainty if $\psi(\vec{x}, t_0) = \psi_0(\vec{x})$ is a (normalizable) eigenfunction of $A_{op}$ with eigenvalue $a$.

Suppose $\psi(\vec{x}, t_0) = \psi_0(\vec{x})$ is normalizable but it is not an eigenfunction of $A_{op}$. Then, as we stated earlier, a measurement of $A$ at time $t_0$ will necessarily have $\Delta A \neq 0$.

We first consider physical quantities $A$ such that the maximal set of ON eigenfunctions $\{u_n^{(\alpha)}\}$ of $A_{op}$ is complete in $L^2$.


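Therefore,

$$\psi_0(\vec{x}) = \sum_n \sum_\alpha c_n^{(\alpha)} u_n^{(\alpha)}(\vec{x}) \tag{2.267}$$

a.e. We then have

$$\langle u_m^{(\beta)} \,|\, \psi_0 \rangle = \Big\langle u_m^{(\beta)} \,\Big|\, \sum_n \sum_\alpha c_n^{(\alpha)} u_n^{(\alpha)} \Big\rangle = \sum_n \sum_\alpha c_n^{(\alpha)} \langle u_m^{(\beta)} \,|\, u_n^{(\alpha)} \rangle = \sum_n \sum_\alpha c_n^{(\alpha)} \delta_{\alpha\beta} \delta_{nm} = c_m^{(\beta)}$$

or

$$c_m^{(\beta)} = \langle u_m^{(\beta)} \,|\, \psi_0 \rangle = \text{expansion coefficients} \tag{2.268}$$

Also,

$$\langle \psi_0 \,|\, \psi_0 \rangle = \Big\langle \sum_m \sum_\beta c_m^{(\beta)} u_m^{(\beta)} \,\Big|\, \sum_n \sum_\alpha c_n^{(\alpha)} u_n^{(\alpha)} \Big\rangle = \sum_m \sum_\beta \sum_n \sum_\alpha c_m^{(\beta)*} c_n^{(\alpha)} \langle u_m^{(\beta)} \,|\, u_n^{(\alpha)} \rangle$$

$$= \sum_m \sum_\beta \sum_n \sum_\alpha c_m^{(\beta)*} c_n^{(\alpha)} \delta_{\alpha\beta} \delta_{nm} = \sum_n \sum_\alpha \left| c_n^{(\alpha)} \right|^2 \tag{2.269}$$

Let us now calculate $\langle A^N \rangle$, the so-called $N^{th}$ moment of the probability distribution of measurements of $A$ ($N = 0, 1, 2, \ldots$). We have

$$\langle A^N \rangle = \frac{\langle \psi_0 \,|\, A_{op}^N \psi_0 \rangle}{\langle \psi_0 \,|\, \psi_0 \rangle} = \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \Big\langle \sum_m \sum_\beta c_m^{(\beta)} u_m^{(\beta)} \,\Big|\, A_{op}^N \sum_n \sum_\alpha c_n^{(\alpha)} u_n^{(\alpha)} \Big\rangle$$

$$= \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \Big\langle \sum_m \sum_\beta c_m^{(\beta)} u_m^{(\beta)} \,\Big|\, \sum_n \sum_\alpha c_n^{(\alpha)} (a_n)^N u_n^{(\alpha)} \Big\rangle$$

$$= \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \sum_m \sum_\beta \sum_n \sum_\alpha c_m^{(\beta)*} c_n^{(\alpha)} (a_n)^N \langle u_m^{(\beta)} \,|\, u_n^{(\alpha)} \rangle = \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \sum_n \sum_\alpha \left| c_n^{(\alpha)} \right|^2 (a_n)^N$$

Therefore, the $N^{th}$ moment of the probability distribution of measurements of $A$ is

$$\langle A^N \rangle = \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \sum_n \sum_\alpha \left| c_n^{(\alpha)} \right|^2 (a_n)^N = \sum_n (a_n)^N \left[ \sum_{\alpha=1}^{g_n} \frac{\left| c_n^{(\alpha)} \right|^2}{\langle \psi_0 \,|\, \psi_0 \rangle} \right] \tag{2.270}$$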
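
Let $\wp(a, t_0)$ be the probability that a measurement of $A$ will yield the value $a$ at time $t_0$. Therefore, from our earlier definition,

$$\langle A^N \rangle = \sum_{\substack{\text{all measured} \\ \text{values of } a}} a^N \wp(a, t_0) \tag{2.271}$$

As we showed earlier, the $\langle A^N \rangle$ for $N = 0, 1, 2, \ldots$ uniquely determine the probability distribution of measurements of $A$. Therefore, comparing (calculation versus definition) the above expressions for $\langle A^N \rangle$ we have a most important result

$$\wp(a, t_0) = \begin{cases} \displaystyle\sum_{\alpha=1}^{g_n} \frac{\left| c_n^{(\alpha)} \right|^2}{\langle \psi_0 \,|\, \psi_0 \rangle} & \text{for } a = a_n \text{ [an eigenvalue of } A_{op}] \\[2ex] 0 & \text{for } a \text{ not an eigenvalue of } A_{op} \end{cases} \tag{2.272}$$

Any measurement of $A$ will yield a value which must be an eigenvalue of $A_{op}$. The probability of observing a given eigenvalue is given by the above expression (2.272).

Now

$$c_n^{(\alpha)} = \langle u_n^{(\alpha)} \,|\, \psi_0 \rangle \;,\quad c_n^{(\alpha)*} = \langle \psi_0 \,|\, u_n^{(\alpha)} \rangle \tag{2.273}$$

Therefore,

$$\wp(a, t_0) = \begin{cases} \displaystyle\sum_{\alpha=1}^{g_n} \frac{\langle \psi_0 \,|\, u_n^{(\alpha)} \rangle \langle u_n^{(\alpha)} \,|\, \psi_0 \rangle}{\langle \psi_0 \,|\, \psi_0 \rangle} & \text{for } a = a_n \text{ [an eigenvalue of } A_{op}] \\[2ex] 0 & \text{for } a \text{ not an eigenvalue of } A_{op} \end{cases} \tag{2.274}$$

Thus, the only possible results of a measurement of a physical quantity are the eigenvalues of its operator!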
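
The probability rule can be illustrated in a finite-dimensional setting, where a hermitian matrix plays the role of $A_{op}$ and a complex vector plays the role of $\psi_0$. The following is a sketch under that assumption; the particular 3x3 matrix (eigenvalue 1 two-fold degenerate, eigenvalue 3 non-degenerate) and the state vector are hypothetical choices made only for illustration. It sums $|c_n^{(\alpha)}|^2 / \langle \psi_0 | \psi_0 \rangle$ over each degenerate set and compares the moments computed from the probabilities with those computed directly.

```python
import numpy as np

# Hypothetical hermitian "operator": eigenvalues 1 (two-fold degenerate) and 3.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
psi0 = np.array([1.0, 2.0, 2.0j])      # deliberately not normalised
norm = np.vdot(psi0, psi0).real        # <psi0|psi0>

eigenvalues, U = np.linalg.eigh(A)     # columns of U are orthonormal eigenvectors
c = U.conj().T @ psi0                  # expansion coefficients <u|psi0>

# Sum |c|^2 / <psi0|psi0> over the degenerate eigenvectors of each eigenvalue.
probs = {}
for a, weight in zip(np.round(eigenvalues, 10), np.abs(c) ** 2 / norm):
    probs[float(a)] = probs.get(float(a), 0.0) + weight

def moment_direct(N):
    """<A^N> = <psi0|A^N psi0> / <psi0|psi0>."""
    return (np.vdot(psi0, np.linalg.matrix_power(A, N) @ psi0) / norm).real

def moment_from_probs(N):
    """<A^N> = sum over eigenvalues of a^N * probability(a)."""
    return sum(a ** N * p for a, p in probs.items())

print("probabilities:", probs)
for N in range(4):
    print(N, moment_direct(N), moment_from_probs(N))
```

The two columns of moments should agree for every $N$, and the $N = 0$ row checks that the probabilities sum to one.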


Notes 


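1. Let $\psi(\vec{x}, t_0) = \psi_0(\vec{x})$ be an eigenfunction of $A_{op}$ with eigenvalue $a_r$ so that $A_{op}\psi_0(\vec{x}) = a_r\psi_0(\vec{x})$. We should find that a measurement of $A$ will yield the result $a_r$ with certainty. Let us check this. $\{u_r^{(1)}, u_r^{(2)}, \ldots, u_r^{(g_r)}\}$ is a CON set of eigenfunctions for the eigenvalue $a_r$. Therefore,

$$\psi_0(\vec{x}) = \sum_{\alpha=1}^{g_r} d_\alpha u_r^{(\alpha)} \tag{2.275}$$

that is, only eigenfunctions for $a_r$ occur in the expansion. Therefore,

$$\langle \psi_0 \,|\, \psi_0 \rangle = \sum_{\alpha=1}^{g_r} |d_\alpha|^2 \tag{2.276}$$

and

$$\wp(a, t_0) = \begin{cases} \displaystyle\sum_{\alpha=1}^{g_r} \frac{|d_\alpha|^2}{\langle \psi_0 \,|\, \psi_0 \rangle} = 1 & \text{for } a = a_r \\[2ex] 0 & \text{for } a \neq a_r \end{cases} \tag{2.277}$$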
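
2. If the eigenvalues are non-degenerate, then $\alpha = 1$ is the only value and we need not write $\alpha$ at all

$$\psi_0(\vec{x}) = \sum_n c_n u_n(\vec{x}) \;,\quad A_{op} u_n(\vec{x}) = a_n u_n(\vec{x})$$

$$c_n = \langle u_n \,|\, \psi_0 \rangle \;,\quad \langle u_n \,|\, u_m \rangle = \delta_{nm}$$

$$\langle \psi_0 \,|\, \psi_0 \rangle = \sum_n |c_n|^2$$

$$\wp(a, t_0) = \begin{cases} \dfrac{|c_n|^2}{\langle \psi_0 \,|\, \psi_0 \rangle} & \text{for } a = a_n \\[2ex] 0 & \text{for } a \neq \text{eigenvalue of } A_{op} \end{cases} \tag{2.278}$$

3. A required condition on $\wp(a, t_0)$ is

$$\sum_{\substack{\text{all values} \\ \text{of } a}} \wp(a, t_0) = 1 \tag{2.279}$$

Let us check this. Recall that

$$\langle \psi_0 \,|\, \psi_0 \rangle = \sum_{n, \alpha} \left| c_n^{(\alpha)} \right|^2 \tag{2.280}$$

so that

$$\sum_{\substack{\text{all values} \\ \text{of } a}} \wp(a, t_0) = \sum_n \left[ \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \sum_{\alpha=1}^{g_n} \left| c_n^{(\alpha)} \right|^2 \right] = \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \sum_{n, \alpha} \left| c_n^{(\alpha)} \right|^2 = 1$$

as required.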


Example 

Let $A_{op} = L_{z\,op}$. We have shown that the eigenvalues are $\ell\hbar$ ($\ell = 0, \pm 1, \ldots$) and that the maximal ON set of eigenfunctions $\{u_\ell^{(\alpha)}(\vec{x})\}$ is complete in $L^2$. Thus, a measurement of $L_z$ at any time will yield a value which must be $0, \pm\hbar, \pm 2\hbar, \ldots$

$L_z$ is said to be quantized. If $\psi(\vec{x}, t_0)$ is not an eigenfunction, then there will be a spread in possible values of $L_z$ and the previous discussion tells us how to calculate $\wp(\ell\hbar, t_0)$.


2.2. Energy Measurements 

Let us now turn our attention to energy measurements. We must therefore consider the eigenvalue equation for energy (called the time-independent Schrodinger equation). Assume that the particle has conservative forces acting on it so that $\vec{F} = -\nabla V(\vec{x})$ where $V(\vec{x})$ is the potential energy. Then

$$H = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \tag{2.281}$$

$$H_{op} = -\frac{\hbar^2}{2m} \nabla^2 + V(\vec{x}) \tag{2.282}$$

We must solve the eigenvalue equation

$$H_{op}\phi(\vec{x}) = \left[ -\frac{\hbar^2}{2m} \nabla^2 + V(\vec{x}) \right] \phi(\vec{x}) = E\phi(\vec{x}) \tag{2.283}$$

The eigenvalue $E$ is a constant (independent of $\vec{x}$). Only for certain values of $E$ will the equation have a solution $\phi(\vec{x})$ which is normalizable.

These normalizable eigenfunctions of $H_{op}$ are said to represent bound states. If $\psi(\vec{x}, t_0)$ is a normalizable eigenfunction of $H_{op}$ then it has a precise energy ($\Delta E = 0$) and

$$\wp(\vec{x}, t_0) = \frac{|\psi(\vec{x}, t_0)|^2}{\langle \psi \,|\, \psi \rangle} \to 0 \text{ as } r \to \infty \tag{2.284}$$

which is required for normalizability. Thus, the probability of observing the particle at a certain position is confined (non-negligible probability) to some finite (bounded) region of space - the particle has definite energy and is bound in this region.

As an example, consider the classical motion of a charge (-q) about a fixed 
charge (+q) as shown in Figure 2.10 below. 



Figure 2.10: Classical Orbits (the bound orbit has E < 0)

Quantum mechanically, the orbit trajectories are not well-defined (even though the energies are precise). However, one would expect the classical bound orbit to correspond to a normalizable quantum eigenfunction and the classical unbound orbit to correspond to a non-normalizable quantum eigenfunction ($\wp(\vec{x}, t_0)$ non-zero for $r \to \infty$). The non-normalizable eigenfunctions are therefore physically interesting and should play a role in the theory. Indeed, we expect that the normalizable eigenfunctions of $H_{op}$ for the above charge will not be complete in $L^2$ because unbound motion is also possible. We will postpone discussing non-normalizable eigenfunctions of $H_{op}$ and concentrate on the normalizable eigenfunctions. Remember that only normalizable wave functions have been incorporated in our theory so far. We expect that the normalizable eigenfunctions of $H_{op}$ will be complete in $L^2$ if there are no unbound orbits possible for the classical $H$.


Note 

For the differential equation

$$-\frac{\hbar^2}{2m} \nabla^2 \phi(\vec{x}) + V(\vec{x})\phi(\vec{x}) = E\phi(\vec{x}) \tag{2.285}$$

if $V(\vec{x})$ is finite in some region ($V(\vec{x})$ may, however, contain discontinuities in this region), then $\phi(\vec{x})$ and $\partial\phi(\vec{x})/\partial x_i$ must be continuous in this region (so that $\nabla^2\phi(\vec{x})$ is finite, as required by the above equation for finite $V(\vec{x})$).


Example 


We consider a particle (mass $m$) constrained to move along a frictionless horizontal wire (dashed line in top part of Figure 2.11 below) between two rigid (impenetrable) walls. This is one-dimensional motion. We then have

$$H = \frac{p_x^2}{2m} + V(x) \;,\quad p_{x\,op} = \frac{\hbar}{i} \frac{d}{dx} \tag{2.286}$$

with

$V(x) = 0$ for $x \in (0, L)$ so that $F_x = 0$ for $x \in (0, L)$

and

$V(x) = V_0 \to \infty$ for $x < 0$ and for $x > L$

as shown in the lower part of Figure 2.11 below.

Figure 2.11: Particle in a Rigid Box (lower part: $V(x)$ versus $x$, equal to $V_0$ outside the walls at $x = 0$ and $x = L$ and zero in region (b) between them)

We have $F_x = -\infty$ at $x = L$ and $F_x = +\infty$ at $x = 0$. Therefore, the (classical) particle cannot get into the regions $x < 0$ and $x > L$ no matter how energetic it is (the kinetic energy $p^2/2m$ would have to be negative to obtain a finite total energy).

Classical Motion: The particle moves back and forth between the two walls with constant speed. The reflection at each wall just reverses the particle's direction of motion. $E_{classical} = mv^2/2$ can take on any value from $0$ to $\infty$. All motion is bound between $0$ and $L$.

The eigenvalue equation for energy (time-independent Schrodinger equation) is

$$H_{op}\phi(x) = E\phi(x) \;\Rightarrow\; -\frac{\hbar^2}{2m} \frac{d^2\phi(x)}{dx^2} + V(x)\phi(x) = E\phi(x) \tag{2.287}$$

with $E$ = constant and $\phi(x)$ normalizable.

Note: Because $V_0 \to \infty$, it is not clear whether or not $\phi(x)$ and $d\phi(x)/dx$ are continuous at $x = 0, L$ ($\phi(x)$ and $d\phi(x)/dx$ must be continuous in regions where $V(x)$ is finite).

Let us look at this issue of boundary conditions in two ways, first for this specific problem and then in general using the Schrodinger equation (this last approach will then apply to all potential energy functions).

Boundary Condition for this Particular System 
Claims:

1. $\phi(x) = 0$ for $x < 0$ (region (a)) and for $x > L$ (region (c)).

2. $\phi(x)$ in region (b) ($0 < x < L$) goes to zero as $x \to 0$ and as $x \to L$ so that $\phi(x)$ is continuous for all $x$.

Conditions (1) and (2) will determine $\phi(x)$ and we will then see that $d\phi(x)/dx$ is discontinuous at $x = 0$ and at $x = L$.

Proof of conditions (1) and (2)

Let $V_0$ be finite. Then $\phi(x)$ and $d\phi(x)/dx$ are necessarily continuous everywhere - in particular at $x = 0$ and $x = L$. After obtaining the conditions on $\phi(x)$ for finite $V_0$ we now let $V_0 \to \infty$.

Let $E < V_0$ (this is no restriction since $V_0 \to \infty$).


Region (a):

$$-\frac{\hbar^2}{2m} \frac{d^2\phi_a}{dx^2} + V_0\phi_a = E\phi_a \tag{2.288}$$

Region (b):

$$-\frac{\hbar^2}{2m} \frac{d^2\phi_b}{dx^2} = E\phi_b \tag{2.289}$$

Region (c):

$$-\frac{\hbar^2}{2m} \frac{d^2\phi_c}{dx^2} + V_0\phi_c = E\phi_c \tag{2.290}$$

where $E$ is the same in regions (a), (b), and (c) because the eigenvalue is independent of $x$.

Now, let

$$k = \frac{\sqrt{2mE}}{\hbar} \;,\quad K = \frac{\sqrt{2m(V_0 - E)}}{\hbar}$$

Note that $K$ is a positive real number for $E < V_0$. Therefore,

Region (a):

$$\frac{d^2\phi_a}{dx^2} - K^2\phi_a = 0 \;\Rightarrow\; \phi_a = \alpha e^{Kx} + \beta e^{-Kx} \tag{2.291}$$

Since $x < 0$ the term $e^{-Kx}$ diverges as $x \to -\infty$.
Thus, $\phi$ is normalizable only if $\beta = 0$.
Therefore, $\phi_a = \alpha e^{Kx}$.

Region (b):

$$\frac{d^2\phi_b}{dx^2} = -k^2\phi_b \;\Rightarrow\; \phi_b = A \sin kx + B \cos kx \tag{2.292}$$

Region (c):

$$\frac{d^2\phi_c}{dx^2} - K^2\phi_c = 0 \;\Rightarrow\; \phi_c = \gamma e^{Kx} + \delta e^{-Kx} \tag{2.293}$$

Since $x > L$ the term $e^{Kx}$ diverges as $x \to +\infty$.
Thus, $\phi$ is normalizable only if $\gamma = 0$.

Therefore, $\phi_c = \delta e^{-Kx}$.

We note that $\alpha, A, B, \delta$ may depend on $V_0$.

For finite $V_0$, $\phi(x)$ and $d\phi(x)/dx$ are continuous at $x = 0$ and $x = L$.

$x = 0$:

$$\phi_a(0) = \phi_b(0) \;\Rightarrow\; \alpha = B \;,\qquad \frac{d\phi_a(0)}{dx} = \frac{d\phi_b(0)}{dx} \;\Rightarrow\; \alpha K = Ak \tag{2.294}$$

$x = L$:

$$\phi_c(L) = \phi_b(L) \;\Rightarrow\; A \sin kL + B \cos kL = \delta e^{-KL}$$

$$\frac{d\phi_c(L)}{dx} = \frac{d\phi_b(L)}{dx} \;\Rightarrow\; Ak \cos kL - Bk \sin kL = -K\delta e^{-KL}$$

The two conditions at $x = 0$ give

$$K = k\frac{A}{B} \tag{2.295}$$

The two conditions at $x = L$ give

$$-K = \frac{k(A \cos kL - B \sin kL)}{A \sin kL + B \cos kL} \tag{2.296}$$


Now let $V_0 \to \infty$. This corresponds to $K \to \infty$ while $k$, $A$ and $B$ stay finite. Note also that $\phi_b = A \sin kx + B \cos kx$ must stay continuous with a continuous derivative as $V_0 \to \infty$ because

$$-\frac{\hbar^2}{2m} \frac{d^2\phi_b}{dx^2} = E\phi_b \tag{2.297}$$

in region (b) for any $V_0$. Now

$$K = k\frac{A}{B} \to \infty \text{ with } kA \text{ finite} \;\Rightarrow\; B \to 0$$

$$-K = \frac{k(A \cos kL - B \sin kL)}{A \sin kL + B \cos kL} \to \frac{k \cos kL}{\sin kL} \text{ with } B \to 0 \;,\quad -K \to -\infty \;\Rightarrow\; \sin kL \to 0$$

Therefore,

$$\phi_b = A \sin kx \text{ with } \sin kL \to 0 \tag{2.298}$$

so that $\phi_b \to 0$ at $x = 0$ and $x = L$ when $V_0 \to \infty$.

Furthermore, $\phi_a = \alpha e^{Kx} \to 0$ as $K \to \infty$ ($x < 0$), and $\phi_c = \delta e^{-Kx} \to 0$ as $K \to \infty$ ($x > L$) because $A \sin kL + B \cos kL = \delta e^{-KL} \to 0$ requires $\delta e^{-Kx} \to 0$ for $x > L$.

This completes the proof of claims (1) and (2). Notice that $\phi_b(0) = \phi_b(L) = 0$ with $\phi_a(x) = \phi_c(x) = 0$ means that $\phi(x)$ is continuous at $x = 0$ and at $x = L$ as $V_0 \to \infty$.
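
The limiting behavior can also be seen numerically. The following sketch is not the analytic argument given above: it assumes units with $\hbar = m = L = 1$, replaces the second derivative by a finite-difference matrix on a grid (a substitute technique chosen only for illustration), and looks at the ground state for increasing wall height $V_0$. The function name `ground_state` is hypothetical.

```python
import numpy as np

# Units with hbar = m = L = 1 (an assumption made only for this sketch).
L = 1.0
x = np.linspace(-1.0, 2.0, 601)       # grid extending beyond both walls
dx = x[1] - x[0]
wall = np.argmin(np.abs(x))            # grid index closest to the wall at x = 0
E_infinite = np.pi ** 2 / 2.0          # ground-state energy for V0 = infinity

def ground_state(V0):
    """Lowest eigenpair of the finite-difference Hamiltonian for wall height V0."""
    V = np.where((x < 0.0) | (x > L), V0, 0.0)
    main = 1.0 / dx ** 2 + V                        # diagonal of -(1/2) d^2/dx^2 + V
    off = -0.5 / dx ** 2 * np.ones(len(x) - 1)      # nearest-neighbour coupling
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    E, phi = np.linalg.eigh(H)
    return E[0], phi[:, 0] / np.sqrt(dx)            # so that sum |phi|^2 dx = 1

results = {V0: ground_state(V0) for V0 in (10.0, 100.0, 1000.0)}
for V0, (E0, phi0) in results.items():
    print(f"V0 = {V0:7.1f}   E0 = {E0:.4f}   |phi(0)| = {abs(phi0[wall]):.4f}")
print(f"infinite-well value of E0: {E_infinite:.4f}")
```

As $V_0$ grows, the wave function at the wall should shrink toward zero and the energy should approach the infinite-well value from below, in line with claims (1) and (2).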


General Discussion of Boundary Conditions 

The Schrodinger equation in 1-dimension is 

$$-\frac{\hbar^2}{2m} \frac{d^2\psi_E(x)}{dx^2} + V(x)\psi_E(x) = E\psi_E(x) \tag{2.299}$$

The solutions $\psi_E(x)$ are the energy eigenstates (eigenfunctions). We are thus faced with solving an ordinary differential equation with boundary conditions.

Since $\psi_E(x)$ is physically related to a probability amplitude and hence to a measurable probability, we assume that $\psi_E(x)$ is continuous since a measurable probability should not be discontinuous. This is the way physics is supposed to interact with the mathematics.


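Using this fact, we can determine the general continuity properties of $d\psi_E(x)/dx$. The continuity property at a particular point, say $x = x_0$, is derived as follows. Integrating the equation across the point $x = x_0$ we get

$$\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} \frac{d^2\psi_E(x)}{dx^2} dx = \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} d\left( \frac{d\psi_E(x)}{dx} \right) = -\frac{2m}{\hbar^2} \left[ E \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} \psi_E(x) dx - \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} V(x)\psi_E(x) dx \right]$$

Taking the limit as $\varepsilon \to 0$ we have

$$\lim_{\varepsilon \to 0} \left( \left. \frac{d\psi_E(x)}{dx} \right|_{x = x_0 + \varepsilon} - \left. \frac{d\psi_E(x)}{dx} \right|_{x = x_0 - \varepsilon} \right) = -\frac{2m}{\hbar^2} \left[ E \lim_{\varepsilon \to 0} \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} \psi_E(x) dx - \lim_{\varepsilon \to 0} \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} V(x)\psi_E(x) dx \right]$$

or the discontinuity in the derivative is given by

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\varepsilon \to 0} \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} V(x)\psi_E(x) dx \tag{2.300}$$

where we have used the continuity of $\psi_E(x)$ to set

$$\lim_{\varepsilon \to 0} \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} \psi_E(x) dx = 0 \tag{2.301}$$

This makes it clear that whether or not $d\psi_E(x)/dx$ has a discontinuity at some point depends directly on the potential energy function at that point.

If $V(x)$ is continuous at $x = x_0$ (harmonic oscillator example), i.e., if

$$\lim_{\varepsilon \to 0} \left[ V(x_0 + \varepsilon) - V(x_0 - \varepsilon) \right] = 0 \tag{2.302}$$

then

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\varepsilon \to 0} \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} V(x)\psi_E(x) dx = 0 \tag{2.303}$$

and $d\psi_E(x)/dx$ is continuous.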


Finally, if $V(x)$ has an infinite jump at $x = x_0$ (infinite square well and delta-function examples), then we have two choices

1. if the potential is infinite over an extended range of $x$ (the infinite well), then we must force $\psi_E(x) = 0$ in that region and use only the continuity of $\psi_E(x)$ as a boundary condition at the edge of the region.

2. if the potential is infinite at a single point, i.e., $V(x) = A\delta(x - x_0)$ then

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\varepsilon \to 0} \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} A\delta(x - x_0)\psi_E(x) dx = \frac{2mA}{\hbar^2} \lim_{\varepsilon \to 0} \psi_E(x_0) = \frac{2mA}{\hbar^2} \psi_E(x_0) \tag{2.304}$$

and thus $d\psi_E(x)/dx$ is discontinuous by an amount proportional to the value of the wave function at that point.

These general results work for all potential energy functions, as we shall see. 


One-Dimensional Infinite Square Well 

Now let us start from the beginning and solve the Schrodinger equation for $V_0 = \infty$, the so-called one-dimensional infinite square well, with the following given:

1. for $0 < x < L$

$$-\frac{\hbar^2}{2m} \frac{d^2\phi}{dx^2} = E\phi$$

2. for $x > L$ and $x < 0$

$$\phi = 0$$

3. $\phi$ is continuous for all $x$; $d\phi(x)/dx$ need not be continuous for $V_0 = \infty$.

(2) and (3) are the conditions we just proved to be required when $V_0 = \infty$.

For $0 < x < L$, we have

$$\frac{d^2\phi}{dx^2} = -k^2\phi \;,\quad k^2 = \frac{2mE}{\hbar^2} \tag{2.305}$$

so that

$$\phi = A \sin kx + B \cos kx \tag{2.306}$$

Then

$$\phi(0) = 0 \;\Rightarrow\; B = 0 \;\Rightarrow\; \phi = A \sin kx$$

$$\phi(L) = 0 \;\Rightarrow\; A \sin kL = 0$$

Assuming that $A \neq 0$ (or the solution would be identically zero everywhere) we have the condition

$$\sin kL = 0 \;\Rightarrow\; k_n L = n\pi \;,\quad n = 1, 2, 3, \ldots \tag{2.307}$$

Note that the case $n = 0 \Rightarrow k = 0 \Rightarrow \phi = 0$ and is not normalizable and the case $n$ negative implies $k$ negative. With $\phi = A \sin kx$, positive and negative $k$ do not lead to independent solutions. Therefore, we need only consider positive $k$.

We have

$$k_n^2 = \frac{2mE_n}{\hbar^2} = \frac{n^2\pi^2}{L^2} \;\Rightarrow\; E_n = \frac{n^2\pi^2\hbar^2}{2mL^2} \;,\quad n = 1, 2, 3, \ldots \tag{2.308}$$

These are the energy eigenvalues. The corresponding energy eigenfunctions are

$$\phi_n(x) = A_n \sin k_n x = A_n \sin\frac{n\pi x}{L} \tag{2.309}$$

Notice that each energy eigenvalue is non-degenerate (

$$\sin\frac{n\pi x}{L} \text{ and } \sin\frac{n'\pi x}{L} \tag{2.310}$$

are linearly independent for $n \neq n'$).


The energy levels are quantized, that is, they take on only certain discrete values. 
This is to be contrasted with the classical situation, in which the particle can 
have any energy in the range [0,oo). 


The ground-state energy

$$E_1 = \frac{\pi^2\hbar^2}{2mL^2} \neq 0 \tag{2.311}$$

This is consistent with the uncertainty principle. The localization of the particle within a range $L$ implies

$$\Delta p_x \gtrsim \frac{\hbar}{L} \;\Rightarrow\; E \approx \frac{(\Delta p_x)^2}{2m} \gtrsim \frac{\hbar^2}{2mL^2} \tag{2.312}$$

The spacing between low-lying energy levels is of the order $\hbar^2/mL^2$.

1. Macroscopic object

$$m \approx 1 \text{ gm} \;,\quad L \approx 10 \text{ cm} \;,\quad \frac{\hbar^2}{mL^2} \approx 10^{-56} \text{ erg} \tag{2.313}$$

2. Electron confined to an atomic distance

$$m \approx 10^{-27} \text{ gm} \;,\quad L \approx 1\, \text{Å} = 10^{-8} \text{ cm} \;,\quad \frac{\hbar^2}{mL^2} \approx 10^{-11} \text{ erg} \approx 10 \text{ eV} \tag{2.314}$$

3. Proton confined to a nuclear distance

$$m \approx 10^{-24} \text{ gm} \;,\quad L \approx 10^{-12} \text{ or } 10^{-13} \text{ cm} \;,\quad \frac{\hbar^2}{mL^2} \approx 10^{-6} \text{ erg} \approx 1 \text{ MeV} \tag{2.315}$$

Note: An electron-volt (eV) is the energy obtained by an electron when it is accelerated through a potential difference of one volt:

$$1 \text{ eV} = (1.60 \times 10^{-19} \text{ coul})(1 \text{ volt}) = 1.60 \times 10^{-19} \text{ joule} = 1.60 \times 10^{-12} \text{ erg} \tag{2.316}$$

and

$$1 \text{ MeV} = 10^6 \text{ eV} \tag{2.317}$$

Also, visible light typically has $\nu \approx 10^{15} \text{ sec}^{-1}$ so that

$$E_{1\ photon} = h\nu \approx 10^{-12} \text{ erg} \approx 1 \text{ eV} \tag{2.318}$$
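
These order-of-magnitude estimates are easy to reproduce. The sketch below assumes CGS values for the constants and for the electron and proton masses (standard values, not taken from the text), and takes $L = 10^{-12}$ cm for the nuclear case; the function name `level_spacing_erg` is hypothetical.

```python
HBAR = 1.0546e-27          # erg s (CGS value, assumed here)
ERG_PER_EV = 1.602e-12     # 1 eV expressed in erg

def level_spacing_erg(mass_g, length_cm):
    """Order of magnitude of the level spacing, hbar^2 / (m L^2), in erg."""
    return HBAR ** 2 / (mass_g * length_cm ** 2)

cases = {
    "macroscopic object": (1.0, 10.0),
    "electron in an atom": (9.109e-28, 1.0e-8),
    "proton in a nucleus": (1.673e-24, 1.0e-12),
}

spacings = {name: level_spacing_erg(m, L) for name, (m, L) in cases.items()}
for name, erg in spacings.items():
    print(f"{name:22s} {erg:.2e} erg = {erg / ERG_PER_EV:.2e} eV")
```

The results agree with the quoted values to within an order of magnitude, which is all these estimates are meant to convey.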

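Let us check that the eigenfunctions corresponding to different eigenvalues are orthogonal. We have

$$\langle \phi_n \,|\, \phi_\ell \rangle = \int_0^L dx\, A_n^* \sin\frac{n\pi x}{L}\, A_\ell \sin\frac{\ell\pi x}{L}$$

$$= A_n^* A_\ell \left( \frac{1}{2i} \right)^2 \int_0^L dx \left( e^{i\frac{n\pi x}{L}} - e^{-i\frac{n\pi x}{L}} \right) \left( e^{i\frac{\ell\pi x}{L}} - e^{-i\frac{\ell\pi x}{L}} \right)$$

$$= -\frac{A_n^* A_\ell}{4} \int_0^L dx \left( e^{i\frac{(n+\ell)\pi x}{L}} + e^{-i\frac{(n+\ell)\pi x}{L}} - e^{i\frac{(n-\ell)\pi x}{L}} - e^{-i\frac{(n-\ell)\pi x}{L}} \right)$$

$$= -\frac{A_n^* A_\ell}{2} \int_0^L dx \left( \cos\frac{(n+\ell)\pi x}{L} - \cos\frac{(n-\ell)\pi x}{L} \right)$$

so that, for $\ell = n$,

$$\langle \phi_n \,|\, \phi_n \rangle = -\frac{|A_n|^2}{2} \left[ \frac{L}{2n\pi} \sin\left( \frac{2n\pi x}{L} \right) - x \right]_0^L = \frac{L}{2} |A_n|^2 \tag{2.319}$$

and, for $n \neq \ell$,

$$\langle \phi_n \,|\, \phi_\ell \rangle = -\frac{A_n^* A_\ell}{2} \left[ \frac{L}{(n+\ell)\pi} \sin\left( \frac{(n+\ell)\pi x}{L} \right) - \frac{L}{(n-\ell)\pi} \sin\left( \frac{(n-\ell)\pi x}{L} \right) \right]_0^L = 0 \tag{2.320}$$

or

$$\langle \phi_n \,|\, \phi_\ell \rangle = \frac{L}{2} |A_n|^2 \delta_{n\ell} \tag{2.321}$$

Letting

$$A_n = \sqrt{\frac{2}{L}} \;,\quad \phi_n = u_n \tag{2.322}$$

where $\{u_n\}$ is an ON set of eigenfunctions, we have

$$u_n(x) = \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L} \;,\quad \langle u_n \,|\, u_\ell \rangle = \delta_{n\ell} \tag{2.323}$$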
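
The orthonormality of the infinite-well eigenfunctions $u_n(x) = \sqrt{2/L}\, \sin(n\pi x/L)$ can be checked by direct numerical integration. This is a small sketch assuming a midpoint-rule integration on $[0, L]$ and an arbitrary value $L = 2$; the helper name `u` simply mirrors the notation of the text.

```python
import numpy as np

L = 2.0                      # arbitrary box length chosen for the check
M = 20000                    # number of midpoint-rule cells
dx = L / M
x = (np.arange(M) + 0.5) * dx

def u(n):
    """Normalised infinite-well eigenfunction u_n(x) sampled on the grid."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

# Matrix of inner products <u_n|u_l> for n, l = 1..4
gram = np.array([[np.sum(u(n) * u(l)) * dx for l in range(1, 5)]
                 for n in range(1, 5)])
print(np.round(gram, 8))
```

The printed matrix should be the 4x4 identity, that is, $\langle u_n | u_\ell \rangle = \delta_{n\ell}$.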


We note that du n /dx is discontinuous at x = 0 and x = L. Some plots of the 
first three wave functions (n = 1,2,3) are shown in Figure 2.12 below: 








Figure 2.12: n = 1,2,3 wave functions 


Since the eigenfunctions vanish for $x < 0$ and $x > L$, the probability of finding the particle (whose wave function is an energy eigenfunction) outside the region $[0, L]$ is zero. For a particle whose wave function is $u_n(x)$, the probability of finding the particle in $[x, x + dx]$ is not uniform over the region $[0, L]$ since $\wp(x, t) = |u_n(x)|^2$. This should be contrasted with the classical motion at constant speed - classically the particle spends equal amounts of time in each interval $[x, x + dx]$ throughout the region $[0, L]$.


Consider the ON set of all eigenfunctions $\{u_n\}$. Is this complete in $L^2$? Because all the eigenfunctions $u_n(x)$ vanish for $x < 0$ and $x > L$, a square-integrable function which is non-zero outside $[0, L]$ cannot be expanded in terms of the $u_n(x)$. However, if $\psi_0(x)$ is square-integrable and non-zero outside $[0, L]$, then

$$\langle H \rangle = \frac{\langle \psi_0 \,|\, H\psi_0 \rangle}{\langle \psi_0 \,|\, \psi_0 \rangle} = \frac{1}{\langle \psi_0 \,|\, \psi_0 \rangle} \int_{-\infty}^{\infty} dx\, \psi_0^*(x) \left[ -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + V(x) \right] \psi_0(x) \tag{2.324}$$

would be infinite because $V(x) = V_0 = \infty$ outside $[0, L]$. Thus, a necessary condition for $\langle H \rangle$ to be finite is for $\psi_0(x)$ to vanish outside $[0, L]$.


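It is true, however, that the set

$$\left\{ u_n(x) = \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L} \right\}_{n = 1, 2, 3, \ldots} \tag{2.325}$$

is complete for square-integrable functions which vanish outside $[0, L]$. The reasoning goes as follows. We know that

$$\left\{ e^{i\frac{\pi n x}{L}} \right\}_{n = 0, \pm 1, \pm 2, \ldots} \tag{2.326}$$

is complete on $[-L, L]$ (this is just the standard Fourier series result). We have

$$e^{i\frac{\pi n x}{L}} = \cos\frac{\pi n x}{L} + i \sin\frac{\pi n x}{L} \;\Rightarrow\; \left\{ \cos\frac{\pi n x}{L}, \sin\frac{\pi n x}{L} \right\} \text{ for } n = 0, 1, 2, \ldots \text{ is complete on } [-L, L] \tag{2.327}$$

Let $f(x)$ be an arbitrary function on $[0, L]$. Let

$$g(x) = \begin{cases} f(x) & \text{for } x \in [0, L] \\ -f(-x) & \text{for } x \in [-L, 0] \end{cases} \tag{2.328}$$

$g(x)$ can be expanded in terms of

$$\left\{ \cos\frac{\pi n x}{L}, \sin\frac{\pi n x}{L} \right\} \tag{2.329}$$

that is,

$$g(x) = \sum_{n=0}^{\infty} A_n \cos\frac{\pi n x}{L} + \sum_{n=1}^{\infty} B_n \sin\frac{\pi n x}{L} \tag{2.330}$$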

But,

$$A_n \propto \int_{-L}^{L} dx \cos\frac{\pi n x}{L}\, g(x) = 0 \tag{2.331}$$

because $g(x)$ is an odd function of $x$. Thus, $g(x)$ can be expanded in terms of

$$\left\{ \sin\frac{\pi n x}{L} \right\} \tag{2.332}$$

But this expansion must be valid in the subinterval $[0, L]$, where $g(x) = f(x)$. Therefore,

$$\left\{ \sin\frac{\pi n x}{L} \right\} \tag{2.333}$$

is complete on $[0, L]$.

Note: a.e. on $[0, L]$

$$f(x) = \sum_{n=1}^{\infty} B_n \sin\frac{\pi n x}{L} \tag{2.334}$$

For example, if $f(0) \neq 0$, then the endpoint $x = 0$ must be one of the points at which the expansion doesn't work (because each $\sin(\pi n x/L) = 0$ at $x = 0$). It is of no consequence, however, that the expansion does not hold on such a set of measure zero!
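
The "almost everywhere" character of the sine expansion can be seen in a small numerical experiment. The sketch below assumes the hypothetical test function $f(x) = \cos(\pi x/L)$, which has $f(0) = 1 \neq 0$, computes $B_n = (2/L)\int_0^L f(x) \sin(n\pi x/L)\, dx$ with a midpoint rule, and evaluates a 400-term partial sum at an interior point and at the endpoint.

```python
import numpy as np

L = 1.0
M = 20000
dx = L / M
xs = (np.arange(M) + 0.5) * dx            # midpoint grid on [0, L]
f = np.cos(np.pi * xs / L)                # hypothetical test function with f(0) = 1

def B(n):
    """B_n = (2/L) * integral over [0, L] of f(x) sin(n pi x / L) (midpoint rule)."""
    return (2.0 / L) * np.sum(f * np.sin(n * np.pi * xs / L)) * dx

n_values = np.arange(1, 401)
coeffs = np.array([B(n) for n in n_values])

def partial_sum(x):
    """Partial sum of the sine series, using the first 400 terms."""
    return float(np.sum(coeffs * np.sin(n_values * np.pi * x / L)))

print("series at x = L/3:", partial_sum(L / 3), "  f(L/3) =", np.cos(np.pi / 3))
print("series at x = 0  :", partial_sum(0.0), "  f(0) = 1")
```

At the interior point the partial sum is close to $f(L/3) = 0.5$, while at $x = 0$ every term vanishes and the series gives $0$ rather than $f(0) = 1$, which is the set-of-measure-zero failure described in the note.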


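Let $\psi(x, t_0) = \psi_0(x)$ be an arbitrary wave function describing the particle constrained to move along a frictionless horizontal wire between two rigid walls. Assume that $\psi_0(x) = 0$ for $x < 0$ and $x > L$. Then we have

$$\psi_0(x) = \sum_{n=1}^{\infty} c_n u_n(x)$$

$$u_n(x) = \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L} \text{ (non-degenerate)} \;,\quad E_n = n^2 \frac{\pi^2\hbar^2}{2mL^2} \;,\quad \langle u_n \,|\, u_m \rangle = \delta_{nm}$$

which gives expansion coefficients

$$c_n = \langle u_n \,|\, \psi_0 \rangle = \sqrt{\frac{2}{L}} \int_0^L dx \sin\frac{n\pi x}{L}\, \psi_0(x) \tag{2.335}$$

and

$$\wp(E_n, t_0) = \frac{|c_n|^2}{\langle \psi_0 \,|\, \psi_0 \rangle} \tag{2.336}$$

is the probability that an energy measurement at $t_0$ will yield the value $E_n$. Only eigenvalues of $H_{op}$ can be obtained when an energy measurement is made.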

Figure 2.13: Wave Function

Example

Consider the wave function shown in Figure 2.13 above. We have

$$\psi_0(x) = \begin{cases} 1/\sqrt{L} & \text{for } 0 < x < L \\ 0 & \text{otherwise} \end{cases} \tag{2.337}$$

A particle described by such a wave function can be found anywhere between $0$ and $L$ with equal probability. Note that $\psi_0(x)$ is discontinuous at $x = 0$ and $x = L$. That is fine since we only require that the energy eigenfunctions $u_n(x)$ have to be continuous!

We then have

$$\langle \psi_0 \,|\, \psi_0 \rangle = \int_0^L dx\, \frac{1}{\sqrt{L}} \frac{1}{\sqrt{L}} = 1 \tag{2.338}$$

or it is normalized. Then using

$$\psi_0(x) = \sum_{n=1}^{\infty} c_n u_n(x) \tag{2.339}$$

we have

$$c_n = \langle u_n \,|\, \psi_0 \rangle = \int_0^L dx\, \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L} \frac{1}{\sqrt{L}} = \frac{\sqrt{2}}{n\pi} \left[ -\cos n\pi + 1 \right] = \begin{cases} 0 & \text{for } n \text{ even} \\ \dfrac{2\sqrt{2}}{n\pi} & \text{for } n \text{ odd} \end{cases} \tag{2.340}$$

Therefore

$$\wp(E_n, t_0) = \frac{|c_n|^2}{\langle \psi_0 \,|\, \psi_0 \rangle} = \begin{cases} 0 & \text{for } n \text{ even} \\ \dfrac{8}{n^2\pi^2} & \text{for } n \text{ odd} \end{cases} \tag{2.341}$$

which is the probability of measuring the energy eigenvalue $E_n$ when we are in the state described by $\psi_0(x)$ at $t_0$. Note that it is essentially the absolute square of the expansion coefficient $|c_n|^2$.


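Note: Since

$$\sum_{\text{all } n} \wp(E_n,t_0) = 1 \tag{2.342}$$

we have

$$\sum_{n=1,3,5,\ldots} \frac{8}{n^2\pi^2} = 1 \quad\Rightarrow\quad \sum_{n=1,3,5,\ldots} \frac{1}{n^2} = \frac{\pi^2}{8} \tag{2.343}$$

which is a rather interesting mathematical result.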
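
As a numerical aside (not part of the derivation above), the coefficients (2.340) and the sum rule (2.343) can be checked with a few lines of Python. This is only a sketch: the box width `L = 1.0`, the midpoint-rule integrator and all function names are assumptions chosen for illustration.

```python
import math

L = 1.0  # box width in arbitrary units (assumption)

def u(n, x):
    """Box eigenfunction u_n(x) = sqrt(2/L) sin(n pi x / L) for 0 <= x <= L."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def psi0(x):
    """Flat wave function of (2.337): 1/sqrt(L) inside the box."""
    return 1.0 / math.sqrt(L)

def c_numeric(n, steps=20000):
    """Midpoint-rule estimate of c_n = integral_0^L u_n(x) psi0(x) dx."""
    h = L / steps
    return h * sum(u(n, (k + 0.5) * h) * psi0((k + 0.5) * h) for k in range(steps))

def c_exact(n):
    """Closed form (2.340): 0 for even n, 2 sqrt(2) / (n pi) for odd n."""
    return 0.0 if n % 2 == 0 else 2.0 * math.sqrt(2.0) / (n * math.pi)

def total_probability(n_max):
    """Partial sum of |c_n|^2 for n = 1..n_max; tends to 1 as n_max grows."""
    return sum(c_exact(n) ** 2 for n in range(1, n_max + 1))

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, c_numeric(n), c_exact(n))
    print(total_probability(100001))
    print(sum(1.0 / n**2 for n in range(1, 100001, 2)), math.pi**2 / 8)
```

The partial sums approach 1 only slowly (the remainder falls off like $1/n_{max}$), which already hints at the divergence of higher moments for this discontinuous wave function.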


Suppose that we want to calculate $\langle H \rangle$ for the above state. We cannot use

$$\langle H \rangle = \frac{\langle \psi_0 | H\psi_0 \rangle}{\langle \psi_0 | \psi_0 \rangle} = \frac{1}{\langle \psi_0 | \psi_0 \rangle} \int_{-\infty}^{\infty} dx\,\psi_0^*(x)\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi_0(x)$$

because $\psi_0(x)$ is discontinuous at $x = 0$ and $x = L$ and therefore $d^2\psi_0(x)/dx^2$
is not defined at these points. These two isolated points cannot be omitted
from the integral because $d^2\psi_0(x)/dx^2$ is infinite at these points and can give a
non-zero contribution to the integral. The correct way to proceed in this case
is to use the definition of the average value:

$$\langle H \rangle = \sum_{\text{all } n} \wp(E_n,t_0)\,E_n = \sum_{n=1,3,5,\ldots} \frac{8}{n^2\pi^2}\,\frac{n^2\pi^2\hbar^2}{2mL^2} = \infty \tag{2.344}$$
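
The divergence in (2.344) is easy to see numerically: every odd $n$ contributes the same amount, $4\hbar^2/(mL^2)$, so the partial sums grow without bound. The short sketch below works in units of $\hbar^2/(mL^2)$; the function name and the choice of units are assumptions made for illustration.

```python
import math

def energy_partial_sum(n_max):
    """Sum of p(E_n) * E_n over odd n <= n_max, in units of hbar^2 / (m L^2).

    p(E_n) = 8 / (n^2 pi^2) and E_n = n^2 pi^2 / 2 in these units, so the
    n-dependence cancels and every odd n contributes exactly 4.
    """
    total = 0.0
    for n in range(1, n_max + 1, 2):
        p_n = 8.0 / (n**2 * math.pi**2)
        e_n = n**2 * math.pi**2 / 2.0
        total += p_n * e_n
    return total

if __name__ == "__main__":
    for n_max in (1, 9, 99, 999):
        print(n_max, energy_partial_sum(n_max))
```

Each tenfold increase in `n_max` multiplies the partial sum by roughly ten, so no finite average energy exists for this state.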


We must now consider those physical quantities whose maximal set of ON
eigenfunctions $\{u_n^{(\alpha)}\}$ is not complete in $L^2$. An arbitrary square-integrable wave
function cannot be expanded in terms of the $\{u_n^{(\alpha)}\}$ and therefore, the results
of previous discussions cannot be used to calculate the probability distribution
of measurements.


Let A be the hermitian operator corresponding to some physical quantity. A 
must satisfy some additional, technical assumptions for the following discussion 
to be valid. We will not go into such technicalities here. 

Let us generalize the eigenvalue equation for $A$:

$$A\phi(\vec{x}) = a\,\phi(\vec{x}) \tag{2.345}$$

where we now require

1. $\phi(\vec{x})$ is not identically zero

2. $\langle \phi | \psi \rangle = \int d^3x\,\phi^*(\vec{x})\,\psi(\vec{x})$ is finite for almost every
   square-integrable function $\psi(\vec{x})$.

Note: If $\phi(\vec{x})$ is normalizable, then $\langle \phi | \psi \rangle$ is finite for every square-integrable
function $\psi(\vec{x})$. Thus, our generalized eigenvalue equation has all the normalizable
eigenfunctions we considered before. However, there exist non-normalizable
functions obeying condition (2). For example, $f(\vec{x}) = e^{i\vec{k}\cdot\vec{x}}$ has:

$$\langle f | f \rangle = \int d^3x\,\left|e^{i\vec{k}\cdot\vec{x}}\right|^2 = \infty$$

and

$$\langle f | \psi \rangle = \int d^3x\,e^{-i\vec{k}\cdot\vec{x}}\,\psi(\vec{x})$$

exists a.e. by the Fourier transform theorem.

We have thus enlarged the space of functions (called a rigged Hilbert space) to
which an eigenfunction can belong. However, the wave function for a particle
must still be normalizable.


Note: $\phi(\vec{x})$ does not obey condition (2) if $|\phi(\vec{x})| \to \infty$ as $|x| \to \infty$ or $|y| \to \infty$ or
$|z| \to \infty$. For simplicity, let us demonstrate this for a function of one variable
$\phi(x)$. There are square-integrable functions $\psi(x)$ which go as $1/x$ as $|x| \to \infty$
and

$$\int_{-\infty}^{\infty} dx\,|\psi(x)|^2$$

converges at the limits of integration:

$$\int^{\infty} \frac{dx}{x^2} \sim \left.-\frac{1}{x}\right|^{\infty} \to 0 \tag{2.346}$$

For such functions,

$$\langle \phi | \psi \rangle = \int d^3x\,\phi^*(\vec{x})\,\psi(\vec{x})$$

does not converge at the limits of integration,

$$\int^{\infty} dx\,\frac{1}{x}\,\phi^* \to (\infty) \times (\ln \infty) \to \infty \tag{2.347}$$

The spectrum of $A$ is the set of all eigenvalues of $A$, where the eigenfunctions
must obey conditions (1) and (2). The following results are from the spectral
theory of self-adjoint operators discussed in Chapter 4 of this book. Some of the
results we have already proved, others require detailed mathematical discussions
because the inner product of two eigenfunctions, each obeying condition (2),
may not exist, that is, $\langle \phi_1 | \phi_2 \rangle$ does not necessarily exist for non-normalizable
$\phi_1$ and $\phi_2$.
Discrete Part of the Spectrum

In this case, the corresponding eigenfunctions are normalizable. Now

$$A\phi_n^{(\alpha)}(\vec{x}) = a_n\,\phi_n^{(\alpha)}(\vec{x}) \tag{2.348}$$

where $\alpha$ labels linearly independent eigenfunctions having the same eigenvalue.
The eigenvalues $a_n$ are countable ($n = 1, 2, 3, \ldots$), that is, the eigenvalues do
not vary over a continuous range of values. This is the separability property of
$L^2$.

1. Each $a_n$ is real.

2. Eigenfunctions belonging to different eigenvalues are orthogonal.

3. $a_n$ may be degenerate. $g_n$-fold degeneracy implies one can construct $g_n$
   orthogonal eigenfunctions belonging to $a_n$, $(\phi_n^{(1)}, \ldots, \phi_n^{(g_n)})$, such that any
   eigenfunction corresponding to the eigenvalue $a_n$ can be expressed as a
   linear superposition of the $(\phi_n^{(1)}, \ldots, \phi_n^{(g_n)})$.

4. $\langle \phi_n^{(\alpha)} | \phi_n^{(\alpha)} \rangle = N_n^{(\alpha)}$, which is a finite, non-zero constant (this follows from
   the normalizability of the $\phi_n^{(\alpha)}$). Therefore,

$$\langle \phi_m^{(\beta)} | \phi_n^{(\alpha)} \rangle = N_n^{(\alpha)}\,\delta_{mn}\,\delta_{\alpha\beta} \tag{2.349}$$

Let

$$u_n^{(\alpha)} = \frac{1}{\sqrt{N_n^{(\alpha)}}}\,\phi_n^{(\alpha)} \tag{2.350}$$

Therefore, $\langle u_m^{(\beta)} | u_n^{(\alpha)} \rangle = \delta_{mn}\,\delta_{\alpha\beta}$, that is, the $u_n^{(\alpha)}$ are ON eigenfunctions
(normalizable) and we have

$$A u_n^{(\alpha)} = a_n\,u_n^{(\alpha)} \tag{2.351}$$


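Continuous Part of the Spectrum

In this case, the corresponding eigenfunctions are not normalizable, but obey
the conditions (1) and (2) stated earlier. Now

$$A\phi_{c\nu}^{(\alpha)}(\vec{x}) = a_{c\nu}\,\phi_{c\nu}^{(\alpha)}(\vec{x}) \tag{2.352}$$

where the subscript $c$ stands for continuum and $\alpha$ labels linearly independent
eigenfunctions having the same eigenvalue. The eigenvalues $a_{c\nu}$ vary over a
continuous range of values ($\nu$ is a continuous variable).

1. Each $a_{c\nu}$ is real.

2. Eigenfunctions belonging to different eigenvalues are orthogonal.

3. $a_{c\nu}$ may be degenerate. $g_\nu$-fold degeneracy implies one can construct $g_\nu$
   orthogonal eigenfunctions belonging to $a_{c\nu}$, $(\phi_{c\nu}^{(1)}, \ldots, \phi_{c\nu}^{(g_\nu)})$, such that any
   eigenfunction corresponding to the eigenvalue $a_{c\nu}$ can be expressed as a
   linear superposition of the $(\phi_{c\nu}^{(1)}, \ldots, \phi_{c\nu}^{(g_\nu)})$.

4. $\langle \phi_{c\nu}^{(\alpha)} | \phi_{c\nu}^{(\alpha)} \rangle = \infty$ because $\phi_{c\nu}^{(\alpha)}$ is not normalizable. We therefore expect
   $\langle \phi_{c\mu}^{(\beta)} | \phi_{c\nu}^{(\alpha)} \rangle$ to be proportional to the Dirac delta function $\delta(\mu - \nu)\,\delta_{\alpha\beta}$.

$$\langle \phi_{c\mu}^{(\beta)} | \phi_{c\nu}^{(\alpha)} \rangle = N_\nu^{(\alpha)}\,\delta(\mu - \nu)\,\delta_{\alpha\beta} = \begin{cases} 0 & \text{for } \mu \neq \nu \text{ or } \alpha \neq \beta \\ \infty & \text{for } \mu = \nu \text{ and } \alpha = \beta \end{cases} \tag{2.353}$$

   Let

$$u_{c\nu}^{(\alpha)} = \frac{1}{\sqrt{N_\nu^{(\alpha)}}}\,\phi_{c\nu}^{(\alpha)} \tag{2.354}$$

   Therefore, $\langle u_{c\mu}^{(\beta)} | u_{c\nu}^{(\alpha)} \rangle = \delta(\mu - \nu)\,\delta_{\alpha\beta}$ where $A u_{c\nu}^{(\alpha)} = a_{c\nu}\,u_{c\nu}^{(\alpha)}$.

5. Continuum eigenfunctions are orthogonal to discrete eigenfunctions

$$\langle u_{c\nu}^{(\alpha)} | u_n^{(\beta)} \rangle = 0 \tag{2.355}$$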


Theorem (proof omitted)

Let $\{u_n^{(\alpha)}, u_{c\nu}^{(\beta)}\}$ be a maximal ON set of discrete and continuum eigenfunctions
of the hermitian operator $A$. Here, maximal means that the set contains
eigenfunctions for all eigenvalues with $\alpha$ and $\beta$ going through their full set of values for
each eigenvalue ($\alpha = 1, \ldots, g_n$ for $a_n$ and $\beta = 1, \ldots, g_\nu$ for $a_{c\nu}$). Then $\{u_n^{(\alpha)}, u_{c\nu}^{(\beta)}\}$
is complete in $L^2$, that is, any square-integrable $\psi(\vec{x})$ can be written:

$$\psi(\vec{x}) = \sum_n \sum_{\alpha=1}^{g_n} c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x}) + \int_{\text{entire continuum}} d\nu \sum_{\beta=1}^{g_\nu} d_\nu^{(\beta)}\,u_{c\nu}^{(\beta)}(\vec{x}) \tag{2.356}$$

Note: The continuum eigenvalues may require labeling by more than one
continuous index $\nu = (\nu_1, \nu_2, \ldots)$. In such a case, $d\nu = d\nu_1\,d\nu_2 \cdots$ and

$$\delta(\nu - \mu) = \delta(\nu_1 - \mu_1)\,\delta(\nu_2 - \mu_2) \cdots \tag{2.357}$$


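One can easily solve for the coefficients in the expansion.

$$\begin{aligned} \langle u_m^{(\gamma)} | \psi \rangle &= \sum_n \sum_{\alpha=1}^{g_n} c_n^{(\alpha)}\,\langle u_m^{(\gamma)} | u_n^{(\alpha)} \rangle + \int_{\text{entire continuum}} d\nu \sum_{\beta=1}^{g_\nu} d_\nu^{(\beta)}\,\langle u_m^{(\gamma)} | u_{c\nu}^{(\beta)} \rangle \\ &= \sum_n \sum_{\alpha=1}^{g_n} c_n^{(\alpha)}\,\delta_{mn}\,\delta_{\gamma\alpha} + \int_{\text{entire continuum}} d\nu \sum_{\beta=1}^{g_\nu} d_\nu^{(\beta)}\,(0) = c_m^{(\gamma)} \end{aligned} \tag{2.358}$$

which is finite by earlier condition (2) and

$$\begin{aligned} \langle u_{c\mu}^{(\gamma)} | \psi \rangle &= \sum_n \sum_{\alpha=1}^{g_n} c_n^{(\alpha)}\,\langle u_{c\mu}^{(\gamma)} | u_n^{(\alpha)} \rangle + \int_{\text{entire continuum}} d\nu \sum_{\beta=1}^{g_\nu} d_\nu^{(\beta)}\,\langle u_{c\mu}^{(\gamma)} | u_{c\nu}^{(\beta)} \rangle \\ &= \sum_n \sum_{\alpha=1}^{g_n} c_n^{(\alpha)}\,(0) + \int_{\text{entire continuum}} d\nu \sum_{\beta=1}^{g_\nu} d_\nu^{(\beta)}\,\delta(\mu - \nu)\,\delta_{\gamma\beta} = d_\mu^{(\gamma)} \end{aligned} \tag{2.359}$$

which is finite by earlier condition (2).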

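Note: The wave function for a particle must be normalizable (so that a
probability interpretation is possible). However, the CON set $\{u_n^{(\alpha)}, u_{c\nu}^{(\beta)}\}$, in terms
of which the wave function can be expanded, may contain non-normalizable
eigenfunctions. Let $\psi(\vec{x},t_0)$ be the wave function for the particle at $t_0$.

$\psi(\vec{x},t_0) = u_n^{(\alpha)}(\vec{x})$ is possible ($u_n^{(\alpha)}$ normalizable)
$\psi(\vec{x},t_0) = u_{c\nu}^{(\alpha)}(\vec{x})$ is not allowed ($u_{c\nu}^{(\alpha)}$ not normalizable)

However,

$$\psi(\vec{x},t_0) = \left(\frac{1}{\Delta\nu}\right)^{1/2} \int_{\Delta\nu\ \text{region}} d\nu\,u_{c\nu}^{(\alpha)}(\vec{x}) \tag{2.360}$$

with $\alpha$ fixed and $\Delta\nu$ some region in the continuum, is a normalizable wave
function for any $\Delta\nu \neq 0$ (no matter how small but non-zero). Then

$$\langle \psi | \psi \rangle = \frac{1}{\Delta\nu} \int_{\Delta\nu} d\nu \int_{\Delta\nu} d\nu'\,\langle u_{c\nu'}^{(\alpha)} | u_{c\nu}^{(\alpha)} \rangle = \frac{1}{\Delta\nu} \int_{\Delta\nu} d\nu \int_{\Delta\nu} d\nu'\,\delta(\nu' - \nu) = \frac{1}{\Delta\nu} \int_{\Delta\nu} d\nu = 1$$

which is thus normalizable. The spread of values produces a normalizable wave
function.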
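
To make (2.360) concrete, the sketch below takes the delta-normalized momentum eigenfunctions $e^{ipx/\hbar}/\sqrt{2\pi\hbar}$ as the continuum set and checks numerically that a superposition over a finite window $\Delta p$ has unit norm, while a single plane wave does not settle to any finite norm. The units ($\hbar = 1$), the window `p0 = 2.0`, `dp = 1.0`, the cutoff and the function names are all assumptions chosen for illustration.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption)

def u_p(p, x):
    """Delta-normalized momentum eigenfunction exp(i p x / hbar) / sqrt(2 pi hbar)."""
    return cmath.exp(1j * p * x / HBAR) / math.sqrt(2.0 * math.pi * HBAR)

def packet(x, p0, dp):
    """(1/dp)^(1/2) times the integral of u_p(x) over p0 <= p <= p0 + dp (closed form)."""
    if x == 0.0:
        return math.sqrt(dp / (2.0 * math.pi * HBAR))
    phase_diff = cmath.exp(1j * (p0 + dp) * x / HBAR) - cmath.exp(1j * p0 * x / HBAR)
    return (HBAR / (1j * x)) * phase_diff / math.sqrt(2.0 * math.pi * HBAR * dp)

def norm_on_interval(f, half_width, step=0.02):
    """Midpoint-rule estimate of the integral of |f(x)|^2 over [-half_width, half_width]."""
    n = int(round(2.0 * half_width / step))
    return step * sum(abs(f(-half_width + (k + 0.5) * step)) ** 2 for k in range(n))

if __name__ == "__main__":
    # Packet: the norm converges (to 1, up to the truncated 1/x^2 tails).
    print(norm_on_interval(lambda x: packet(x, 2.0, 1.0), 500.0))
    # Single plane wave: the "norm" just grows with the size of the interval.
    print(norm_on_interval(lambda x: u_p(2.0, x), 50.0))
    print(norm_on_interval(lambda x: u_p(2.0, x), 100.0))
```

The packet density falls off like $1/x^2$, so truncating the integral at a finite cutoff leaves a small deficit; the plane-wave integral instead doubles whenever the interval doubles.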

Basic Problem 

Suppose we are given a physical quantity $A$ (hermitian). We then solve the
eigenvalue equation as described earlier and form the CON set of eigenfunctions
$\{u_n^{(\alpha)}, u_{c\nu}^{(\beta)}\}$. Let $\psi(\vec{x},t_0) = \psi_0(\vec{x})$ be the wave function for the particle at time
$t_0$. A measurement of $A$ is made at time $t_0$.

What values can be obtained from the measurement and with what probability? 
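We have

$$\psi_0(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x}) + \int d\nu \sum_\beta d_\nu^{(\beta)}\,u_{c\nu}^{(\beta)}(\vec{x}) \tag{2.361}$$

so that

$$\begin{aligned} \langle \psi_0 | A^N \psi_0 \rangle &= \left\langle \sum_{n,\alpha} c_n^{(\alpha)} u_n^{(\alpha)} + \int d\nu \sum_\beta d_\nu^{(\beta)} u_{c\nu}^{(\beta)} \,\middle|\, A^N \left( \sum_{n,\alpha} c_n^{(\alpha)} u_n^{(\alpha)} + \int d\nu \sum_\beta d_\nu^{(\beta)} u_{c\nu}^{(\beta)} \right) \right\rangle \\ &= \left\langle \sum_{n,\alpha} c_n^{(\alpha)} u_n^{(\alpha)} + \int d\nu \sum_\beta d_\nu^{(\beta)} u_{c\nu}^{(\beta)} \,\middle|\, \sum_{n,\alpha} c_n^{(\alpha)} a_n^N u_n^{(\alpha)} + \int d\nu \sum_\beta d_\nu^{(\beta)} a_{c\nu}^N u_{c\nu}^{(\beta)} \right\rangle \\ &= \sum_{n,\alpha} \left|c_n^{(\alpha)}\right|^2 a_n^N + \int d\nu \sum_\beta \left|d_\nu^{(\beta)}\right|^2 a_{c\nu}^N \end{aligned} \tag{2.362}$$

where we have used orthonormality. Therefore, the $N$th moment of the
probability distribution of measurements of $A$ is:

$$\langle A^N \rangle = \frac{\langle \psi_0 | A^N \psi_0 \rangle}{\langle \psi_0 | \psi_0 \rangle} = \sum_n a_n^N \left[ \frac{1}{\langle \psi_0 | \psi_0 \rangle} \sum_\alpha \left|c_n^{(\alpha)}\right|^2 \right] + \int d\nu\,a_{c\nu}^N \left[ \frac{1}{\langle \psi_0 | \psi_0 \rangle} \sum_\beta \left|d_\nu^{(\beta)}\right|^2 \right] \tag{2.363}$$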


The moments $\langle A^N \rangle$ for $N = 0, 1, 2, \ldots$ uniquely determine the probability
distribution of measurements of $A$. Therefore, we need only construct a probability
distribution which gives the above moments.

1. Let $\wp(a_n,t_0)$ be the probability that a measurement of $A$ at $t_0$ will yield
   the discrete eigenvalue $a_n$.

$$\wp(a_n,t_0) = \frac{1}{\langle \psi_0 | \psi_0 \rangle} \sum_\alpha \left|c_n^{(\alpha)}\right|^2 \tag{2.364}$$

2. Let $\wp(a_{c\nu},t_0)\,d\nu$ be the probability that a measurement of $A$ at $t_0$ will yield
   a value in the continuum between $a_{c\nu}$ and $a_{c\nu+d\nu}$ ($\wp(a_{c\nu},t_0)$ is actually
   the probability density).

$$\wp(a_{c\nu},t_0) = \frac{1}{\langle \psi_0 | \psi_0 \rangle} \sum_\beta \left|d_\nu^{(\beta)}\right|^2 \tag{2.365}$$

3. The probability a measurement of $A$ at $t_0$ will yield a value not in the
   spectrum of $A$ is zero.

For non-degenerate eigenvalues, the sums over $\alpha$ and $\beta$ do not appear.
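
The recipe (2.364) can be sketched in code for the purely discrete case (the continuum term is left out here). In the example, the eigenvalue labels, the eigenvalues `1.0` and `3.0`, and the coefficients are hypothetical values chosen only for illustration; the wave function is deliberately not normalized. The sketch also checks that the resulting distribution reproduces the moments (2.363).

```python
def distribution(coeffs):
    """Return {n: probability of obtaining a_n} from expansion coefficients.

    coeffs maps (n, alpha) -> c_n^(alpha). The state need not be normalized:
    by orthonormality, <psi0|psi0> is the sum of |c|^2 over all components.
    """
    norm = sum(abs(c) ** 2 for c in coeffs.values())
    probs = {}
    for (n, _alpha), c in coeffs.items():
        probs[n] = probs.get(n, 0.0) + abs(c) ** 2 / norm
    return probs

def moment(coeffs, eigenvalues, power):
    """<A^N> computed directly as <psi0|A^N psi0> / <psi0|psi0>."""
    norm = sum(abs(c) ** 2 for c in coeffs.values())
    weighted = sum(abs(c) ** 2 * eigenvalues[n] ** power for (n, _alpha), c in coeffs.items())
    return weighted / norm

# Hypothetical example: a_1 = 1.0 is two-fold degenerate, a_2 = 3.0 is non-degenerate.
eigenvalues = {1: 1.0, 2: 3.0}
coeffs = {(1, 1): 1 + 1j, (1, 2): 2.0, (2, 1): -1j}
probs = distribution(coeffs)

if __name__ == "__main__":
    print(probs)
    print(moment(coeffs, eigenvalues, 2),
          sum(probs[n] * eigenvalues[n] ** 2 for n in probs))
```

For a degenerate eigenvalue the probabilities of the individual components add, which is the sum over $\alpha$ in (2.364).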


Example: Consider one-dimensional motion ($x$-axis). Let

$$A = p_{x\,op} = \frac{\hbar}{i}\frac{d}{dx} \tag{2.366}$$

Then

$$A\phi_p = p_{x\,op}\,\phi_p = \frac{\hbar}{i}\frac{d\phi_p}{dx} = p\,\phi_p \tag{2.367}$$

where $p$ = constant eigenvalue. Therefore,

$$\frac{d\phi_p}{dx} = \frac{ip}{\hbar}\,\phi_p \quad\Rightarrow\quad \phi_p(x) = N_p\,e^{ipx/\hbar} \tag{2.368}$$

First we show that $p$ must be real. Let $p = \alpha + i\beta$. Then

$$\phi_p(x) = N_p\,e^{i\alpha x/\hbar}\,e^{-\beta x/\hbar} \tag{2.369}$$

The term $e^{-\beta x/\hbar}$ goes to $\infty$ for $x \to +\infty$ when $\beta < 0$ and goes to $\infty$ for $x \to -\infty$
when $\beta > 0$. Therefore, we must have $\beta = 0$ for condition (2) to be satisfied.
Thus,

$$\phi_p(x) = N_p\,e^{ipx/\hbar} \tag{2.370}$$

with $p$ any real number in the interval $(-\infty, +\infty)$, which is therefore the spectrum
of $p$. Note that the spectrum of $p$ is entirely a continuum. Contrast this with
the spectrum of $L_{z\,op}$, which is entirely discrete.

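Now,

$$\begin{aligned} \langle \phi_{p'} | \phi_p \rangle &= N_{p'}^* N_p \int_{-\infty}^{\infty} dx\,e^{-ip'x/\hbar}\,e^{ipx/\hbar} = N_{p'}^* N_p\,\hbar \int_{-\infty}^{\infty} dy\,e^{iy(p - p')} \\ &= N_{p'}^* N_p\,\hbar\,(2\pi)\,\delta(p - p') = (2\pi\hbar)\,|N_p|^2\,\delta(p - p') \end{aligned} \tag{2.371}$$

since the expression vanishes for $p \neq p'$. Letting $N_p = 1/\sqrt{2\pi\hbar}$, we obtain ON
eigenfunctions

$$u_p(x) = \frac{1}{\sqrt{2\pi\hbar}}\,e^{ipx/\hbar} \quad \text{with} \quad \langle u_{p'} | u_p \rangle = \delta(p - p') \tag{2.372}$$

Notice that we are using $p$ for the eigenvalue $a_{c\nu}$ as well as for the continuous
index $\nu$.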

Now let $\psi(x,t_0)$ be a square-integrable wave function. The completeness of
$\{u_p(x)\}$ implies that

$$\psi(x,t_0) = \int_{-\infty}^{\infty} dp\,\tilde{\psi}(p,t_0)\,u_p(x) \tag{2.373}$$

so that

$$\psi(x,t_0) = \int_{-\infty}^{\infty} \frac{dp}{\sqrt{2\pi\hbar}}\,e^{ipx/\hbar}\,\tilde{\psi}(p,t_0) \tag{2.374}$$

ON implies that the expansion coefficient is

$$\tilde{\psi}(p,t_0) = \langle u_p | \psi \rangle = \int_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi\hbar}}\,e^{-ipx/\hbar}\,\psi(x,t_0) \tag{2.375}$$

which agrees with the Fourier transform theorem. Furthermore,

$$\wp(p,t_0) = \frac{1}{\langle \psi_0 | \psi_0 \rangle}\,\left|\tilde{\psi}(p,t_0)\right|^2 \tag{2.376}$$

using the recipe we developed earlier for $\wp(a_{c\nu},t_0)$. This result agrees with
Postulate 2 with $\wp(p,t_0)$ a new notation for the momentum probability distribution.
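
As an illustration of (2.375) and (2.376), the sketch below computes the momentum distribution of a Gaussian wave function by direct numerical integration and checks that it integrates to one. The Gaussian $e^{-x^2/2}$ (deliberately unnormalized), the units $\hbar = 1$, the integration cutoffs and the function names are assumptions chosen for illustration; for this particular $\psi$ the exact coefficient is $\tilde{\psi}(p) = e^{-p^2/2}$.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption)

def psi(x):
    """Hypothetical unnormalized Gaussian wave function."""
    return math.exp(-x * x / 2.0)

def integrate(f, a, b, steps):
    """Midpoint rule for a real- or complex-valued integrand."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def psi_tilde(p, half_width=10.0, steps=2000):
    """Expansion coefficient <u_p|psi>, integrated numerically."""
    kernel = lambda x: cmath.exp(-1j * p * x / HBAR) * psi(x)
    return integrate(kernel, -half_width, half_width, steps) / math.sqrt(2.0 * math.pi * HBAR)

def momentum_density(p):
    """Probability density for momentum; divides by <psi|psi> since psi is unnormalized."""
    norm = integrate(lambda x: abs(psi(x)) ** 2, -10.0, 10.0, 2000)
    return abs(psi_tilde(p)) ** 2 / norm

if __name__ == "__main__":
    print(psi_tilde(0.7), math.exp(-0.7 ** 2 / 2.0))
    print(integrate(momentum_density, -6.0, 6.0, 120))
```

Dividing by $\langle \psi_0 | \psi_0 \rangle$ is what allows an unnormalized wave function to be used directly.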

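Notes

1. Let me stress that the above derivations of $\wp(a_n,t_0)$ and $\wp(a_{c\nu},t_0)$ are
   valid only if the eigenfunctions $\{u_n^{(\alpha)}, u_{c\nu}^{(\beta)}\}$ are orthonormal:

$$\langle u_{n'}^{(\alpha')} | u_n^{(\alpha)} \rangle = \delta_{n'n}\,\delta_{\alpha'\alpha}\ , \quad \langle u_{c\mu}^{(\beta)} | u_{c\nu}^{(\alpha)} \rangle = \delta(\mu - \nu)\,\delta_{\beta\alpha}\ , \quad \langle u_n^{(\alpha)} | u_{c\nu}^{(\beta)} \rangle = 0$$

   Although the eigenfunctions $\{u_n^{(\alpha)}, u_{c\nu}^{(\beta)}\}$ must be normalized (the continuum
   eigenfunctions are normalized to a Dirac delta function), $\psi(\vec{x},t_0) = \psi_0(\vec{x})$
   need not be normalized ($\langle \psi_0 | \psi_0 \rangle$ is only required to be finite and
   non-zero).

2. Consider the two wave functions $\psi_0(\vec{x})$ and $\tilde{\psi}_0(\vec{x}) = Z\psi_0(\vec{x})$, where $Z$ is
   some non-zero complex constant. $\psi_0(\vec{x})$ and $\tilde{\psi}_0(\vec{x})$ determine the same probability
   distributions $\wp(a_n,t_0)$ and $\wp(a_{c\nu},t_0)$.

   Proof

$$\langle \tilde{\psi}_0 | \tilde{\psi}_0 \rangle = |Z|^2\,\langle \psi_0 | \psi_0 \rangle$$

$$\tilde{c}_n^{(\alpha)} = \langle u_n^{(\alpha)} | \tilde{\psi}_0 \rangle = Z\,\langle u_n^{(\alpha)} | \psi_0 \rangle = Z c_n^{(\alpha)}$$

   Similarly,

$$\tilde{d}_\nu^{(\beta)} = Z d_\nu^{(\beta)} \tag{2.377}$$

   Therefore,

$$\tilde{\wp}(a_n,t_0) = \frac{1}{\langle \tilde{\psi}_0 | \tilde{\psi}_0 \rangle} \sum_\alpha \left|\tilde{c}_n^{(\alpha)}\right|^2 = \frac{1}{\langle \psi_0 | \psi_0 \rangle} \sum_\alpha \left|c_n^{(\alpha)}\right|^2 = \wp(a_n,t_0) \tag{2.378}$$

If the wave function at $t_0$ is known, then the probability distribution of
measurements of any physical quantity at time $t_0$ can be determined.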
2.3. Collapse

A Question 

A natural question now arises: How can one prepare a particle so that its wave
function at $t_0$ is completely determined (up to an arbitrary multiplicative
constant, which has no physical significance according to note (2) above)?

Experimental Result: If a measurement of $A$ at time $t$ yields the value $a_n$ (or
a continuum value in the range $a_{c\nu}$ to $a_{c\nu+d\nu}$), then a subsequent measurement
of $A$ performed immediately after the first measurement will yield the value $a_n$
(or a continuum value in the same range as before) with certainty. The second
measurement must be made immediately after the first so that the forces acting
on the particle do not have time enough to alter the state of the particle. Thus,
a measurement tells us something about a particle right after the measurement
- an immediate repetition of the measurement will yield the same result as did
the first measurement. We note here that the repeatability of measurements
is an assumption which cannot always be realized in
the laboratory.

This experimental result motivates the following postulate. 

2.3.1. Postulate 4: Collapse of the Wave Function 

Let $\{u_n^{(\alpha)}(\vec{x}), u_{c\nu}^{(\beta)}(\vec{x})\}$ be a CON set of eigenfunctions of the physical quantity $A$. Let
$\psi(\vec{x},t_0) = \psi_0(\vec{x})$ so that

$$\psi_0(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x}) + \int d\nu \sum_\beta d_\nu^{(\beta)}\,u_{c\nu}^{(\beta)}(\vec{x}) \tag{2.379}$$

1. If a measurement of $A$ at $t_0$ yields the value $a_n$ (we are assuming that this
   measurement is made with sufficient accuracy so that no other discrete
   eigenvalue lies within the experimental uncertainty), then the particle's
   wave function right after this measurement is

$$\psi_0'(\vec{x}) = \sum_\alpha c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x}) \tag{2.380}$$

   that is, the wave function collapses to that part of $\psi_0(\vec{x})$ which is an
   eigenfunction of $A$ with eigenvalue $a_n$ (linear superposition of degenerate
   states or a single state if eigenvalue is non-degenerate). A measurement
   of $A$ for the state $\psi_0'(\vec{x})$ will yield the result $a_n$ with certainty. Note that
   $\psi_0'(\vec{x})$ cannot be identically zero because $a_n$ can be obtained from the
   measurement only if

$$\wp(a_n,t_0) = \frac{1}{\langle \psi_0 | \psi_0 \rangle} \sum_\alpha \left|c_n^{(\alpha)}\right|^2 \neq 0 \tag{2.381}$$

2. If a measurement of $A$ at $t_0$ yields a continuum value measured to an
   accuracy such that this value lies in the range $a_{c\nu}$ to $a_{c\nu+d\nu}$, then the
   particle's wave function right after this measurement is

$$\psi_0'(\vec{x}) = \int_\nu^{\nu+d\nu} d\nu' \sum_\beta d_{\nu'}^{(\beta)}\,u_{c\nu'}^{(\beta)}(\vec{x}) \tag{2.382}$$

   that is, the wave function collapses to that part of $\psi_0(\vec{x})$ which is an
   eigenfunction of $A$ with eigenvalues in the range $a_{c\nu}$ to $a_{c\nu+d\nu}$ (with certainty).
   Note that $\psi_0'(\vec{x})$ cannot be identically zero because a measurement will
   yield a value in the range $a_{c\nu}$ to $a_{c\nu+d\nu}$ only if

$$\int_\nu^{\nu+d\nu} d\nu'\,\wp(a_{c\nu'},t_0) = \frac{1}{\langle \psi_0 | \psi_0 \rangle} \int_\nu^{\nu+d\nu} d\nu' \sum_\beta \left|d_{\nu'}^{(\beta)}\right|^2 \neq 0 \tag{2.383}$$

The collapse of the wave function has a simple geometric interpretation when
the spectrum of $A$ is entirely discrete, that is

$$\psi_0(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x}) \tag{2.384}$$

with no continuum eigenfunctions present. $\psi_0$ belongs to the vector space of
square-integrable functions and $\{u_n^{(\alpha)}\}$ is a CON set of basis vectors in this
infinite-dimensional vector space. I will draw only three of the orthogonal basis
vectors - the other basis vectors are in orthogonal directions. In general, the
coefficients $c_n^{(\alpha)}$ are complex; however, my diagrams below are necessarily of a
real vector space, $c_n^{(\alpha)}$ real. The superscript $\alpha$ is not needed for eigenfunctions
corresponding to non-degenerate eigenvalues.

Case #1: All non-degenerate eigenvalues. Eigenfunctions and corresponding
eigenvalues are:

$$u_1 \leftrightarrow a_1\ , \quad u_2 \leftrightarrow a_2\ , \quad u_3 \leftrightarrow a_3 \tag{2.385}$$

as shown in Figure 2.14 below.

Let $\langle \psi_0 | \psi_0 \rangle = 1$ and $\psi_0 = c_1 u_1 + c_2 u_2 + c_3 u_3 + \cdots$ as shown in the figure. We then have
$\wp(a_i,t_0) = |c_i|^2$. If a measurement of $A$ yields $a_1$, the wave function collapses
to the vector $c_1 u_1$, which is physically equivalent to $u_1$, and similarly for $a_2$ and
$a_3$.

Figure 2.14: Vectors in Hilbert Space


Case #2: 2-fold degenerate eigenvalue and non-degenerate eigenvalue.
Eigenfunctions and corresponding eigenvalues are:

$$u_1^{(1)} \leftrightarrow a_1\ , \quad u_1^{(2)} \leftrightarrow a_1\ , \quad u_2 \leftrightarrow a_2 \tag{2.386}$$

as shown in Figure 2.15 below:

Figure 2.15: Vectors in Hilbert Space

Let $\langle \psi_0 | \psi_0 \rangle = 1$ and $\psi_0 = c_1^{(1)} u_1^{(1)} + c_1^{(2)} u_1^{(2)} + c_2 u_2 + \cdots$ as shown above. We
then have $\wp(a_1,t_0) = |c_1^{(1)}|^2 + |c_1^{(2)}|^2$. If a measurement of $A$ yields $a_1$, the wave
function collapses to the vector $c_1^{(1)} u_1^{(1)} + c_1^{(2)} u_1^{(2)}$. We also have $\wp(a_2,t_0) = |c_2|^2$.
If a measurement of $A$ yields $a_2$, the wave function collapses to the vector $c_2 u_2$,
which is physically equivalent to $u_2$.
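
Case #2 can be mimicked in a small finite-dimensional sketch, where the wave function is represented only by its expansion coefficients. The coefficient values `0.48`, `0.64`, `0.6` and the function names are hypothetical, chosen so that the state is normalized; nothing here is specific to a particular physical system.

```python
def probability(coeffs, n):
    """Probability of obtaining a_n: sum over alpha of |c_n^(alpha)|^2 / <psi0|psi0>."""
    norm = sum(abs(c) ** 2 for c in coeffs.values())
    return sum(abs(c) ** 2 for (m, _alpha), c in coeffs.items() if m == n) / norm

def collapse(coeffs, measured_n):
    """Keep only the components belonging to the measured eigenvalue a_n.

    coeffs maps (n, alpha) -> c_n^(alpha). The result is not renormalized,
    matching the unnormalized collapsed wave function in the text.
    """
    kept = {key: c for key, c in coeffs.items() if key[0] == measured_n}
    if sum(abs(c) ** 2 for c in kept.values()) == 0.0:
        raise ValueError("this outcome has zero probability")
    return kept

# psi0 = 0.48 u_1^(1) + 0.64 u_1^(2) + 0.6 u_2 (hypothetical, normalized)
psi0 = {(1, 1): 0.48, (1, 2): 0.64, (2, 1): 0.6}
after_a1 = collapse(psi0, 1)
after_a2 = collapse(psi0, 2)

if __name__ == "__main__":
    print(probability(psi0, 1), probability(psi0, 2))
    print(after_a1, probability(after_a1, 1))
    print(after_a2, probability(after_a2, 2))
```

After collapsing onto $a_1$, both degenerate components survive with their original relative weights, and a repeated measurement returns $a_1$ with probability one.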

This simple geometrical picture does not hold when the spectrum of $A$ has a
continuum part because the continuum eigenfunctions are not square-integrable
and therefore do not belong to $L^2$, the space in which $\psi_0$ lies. The collapse of
the wave function onto continuum eigenfunctions cannot be pictured as a simple
projection onto basis vectors in $L^2$.

For the following discussion, I will assume that the spectrum of $A$ is entirely
discrete so that $\{u_n^{(\alpha)}\}$ is a CON set of eigenfunctions. This is done only for
simplicity of notation; all of the following results generalize in an obvious manner
when $A$ has both discrete and continuum parts to its spectrum.

I would like to make a set of measurements on a particle so that, from the results 
of these measurements, I will know what the particle’s wave function is (up to 
an unimportant multiplicative constant) immediately after the measurements, 
that is, I would like to prepare the particle (by making certain measurements) 
so that its wave function is known at a specified time. 

Suppose we write

$$\psi(\vec{x},t_0) = \psi_0(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x}) \tag{2.387}$$

In this situation, the eigenfunctions $\{u_n^{(\alpha)}\}$ are known from solving the
eigenvalue equation for $A$. However, $\psi_0(\vec{x})$ and the $c_n^{(\alpha)}$ have not yet been determined.

A measurement of $A$ at $t_0$ yields the value $a_N$. If $a_N$ is non-degenerate
(eigenfunction $u_N^{(1)}$), the wave function collapses to $\psi_0'(\vec{x}) = c_N^{(1)} u_N^{(1)}$ and the wave
function right after the measurement is known up to the multiplicative constant
$c_N^{(1)}$. If $a_N$ is degenerate (eigenfunctions $u_N^{(1)}, \ldots, u_N^{(g)}$), the wave function
collapses to

$$\psi_0'(\vec{x}) = \sum_{\alpha=1}^{g} c_N^{(\alpha)}\,u_N^{(\alpha)}(\vec{x}) \tag{2.388}$$

and the wave function right after the measurement is not known up to a
multiplicative constant because just knowing the eigenvalue $a_N$ does not tell us the
particular linear superposition of $u_N^{(1)}, \ldots, u_N^{(g)}$ that results, that is, the
coefficients $c_N^{(1)}, \ldots, c_N^{(g)}$ have not yet been determined. More measurements
must be performed on the particle in order to collapse the wave function
further.

However, one must be careful in choosing which subsequent measurements to
make. A measurement of another physical quantity $B$ will usually disturb the
wave function $\psi_0'(\vec{x})$, that is, collapse $\psi_0'(\vec{x})$ to $\psi_0''(\vec{x})$, an eigenfunction of $B$
such that the wave function $\psi_0''(\vec{x})$ after this second measurement is no longer
an eigenfunction of $A$ with eigenvalue $a_N$. That means that the subsequent
measurement of $B$ will usually disturb the particle so that we destroy any
preparation of the particle accomplished by the first measurement (of $A$).
An outline of this argument is as follows:

$\psi(\vec{x},t_0) = \psi_0(\vec{x})$
Measurement of $A$ yields $a_N$
Wave function collapses to $\psi_0'(\vec{x})$ where $A\psi_0'(\vec{x}) = a_N\,\psi_0'(\vec{x})$
Subsequent measurement of $B$ yields $b_M$
Wave function collapses to $\psi_0''(\vec{x})$ where $B\psi_0''(\vec{x}) = b_M\,\psi_0''(\vec{x})$
Is $\psi_0''(\vec{x})$ still an eigenfunction of $A$ with eigenvalue $a_N$?

Assume that it is. Then we have

$$A\psi_0''(\vec{x}) = a_N\,\psi_0''(\vec{x})$$

$$BA\psi_0''(\vec{x}) = B a_N\,\psi_0''(\vec{x}) = a_N B\psi_0''(\vec{x}) = a_N b_M\,\psi_0''(\vec{x})$$

$$AB\psi_0''(\vec{x}) = A b_M\,\psi_0''(\vec{x}) = b_M A\psi_0''(\vec{x}) = b_M a_N\,\psi_0''(\vec{x})$$

$$[A, B]\,\psi_0''(\vec{x}) = 0$$

Therefore, for $\psi_0''(\vec{x})$ to be an eigenfunction of both $A$ and $B$, we must have
$[A, B]\,\psi_0''(\vec{x}) = 0$ or $[A, B] = 0$. For arbitrary observable quantities $A$ and $B$,
this is usually not true, because $[A, B]$ is not necessarily zero. Thus, $\psi_0''(\vec{x})$ will
usually not remain an eigenfunction of $A$.
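
The argument can be illustrated with the smallest possible example: two $2 \times 2$ hermitian matrices that do not commute. The matrices `A` and `B` below are hypothetical stand-ins for observables (they are not taken from the text), and all helper names are assumptions made for this sketch.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))

def apply(a, v):
    """Matrix times 2-component vector."""
    return tuple(sum(a[i][k] * v[k] for k in range(2)) for i in range(2))

def is_eigenvector(a, v, tol=1e-12):
    """True if a v is parallel to v (v assumed non-zero)."""
    w = apply(a, v)
    return abs(w[0] * v[1] - w[1] * v[0]) < tol

A = ((1, 0), (0, -1))   # hypothetical observable with eigenvalues +1 and -1
B = ((0, 1), (1, 0))    # hypothetical observable that does not commute with A

psi_after_A = (1.0, 0.0)               # collapsed onto the eigenvalue +1 of A
psi_after_B = (2 ** -0.5, 2 ** -0.5)   # then collapsed onto the eigenvalue +1 of B

if __name__ == "__main__":
    print(commutator(A, B))
    print(is_eigenvector(A, psi_after_A), is_eigenvector(A, psi_after_B))
```

Here $[A, B] \neq 0$, and the state left behind by the $B$ measurement is an eigenvector of $B$ but no longer an eigenvector of $A$: the second measurement has undone the preparation achieved by the first.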

Definition: Let A and B be the hermitian operators corresponding to two 
physical quantities. A and B are said to be compatible if one can find a CON 
set of functions which are simultaneously eigenfunctions of A and B. We say 
there exists a CON set of simultaneous eigenfunctions of A and B if A and B 
are compatible. 

Note that two operators need not be compatible for them to possess some
simultaneous eigenfunctions. Compatibility requires that the two operators possess
a complete set of simultaneous eigenfunctions.

Therefore, if $\{v_{m,n}^{(\alpha)}(\vec{x})\}$ = CON set of simultaneous eigenfunctions of $A$ and $B$,
then

$$A v_{m,n}^{(\alpha)}(\vec{x}) = a_m\,v_{m,n}^{(\alpha)}(\vec{x})\ , \quad B v_{m,n}^{(\alpha)}(\vec{x}) = b_n\,v_{m,n}^{(\alpha)}(\vec{x}) \tag{2.389}$$

where $\alpha$ labels the linearly independent eigenfunctions having the same $a_m$ and
$b_n$.


Comments

1. $\{v_{m,n}^{(\alpha)}(\vec{x})\}$ is a CON set of wave functions for which both $A$ and $B$ have
   precise values, that is, $\Delta A = 0 = \Delta B$ for each wave function $v_{m,n}^{(\alpha)}(\vec{x})$.
2. Let

$$\psi(\vec{x},t_0) = \psi_0(\vec{x}) = \sum_{m,n,\alpha} c_{m,n}^{(\alpha)}\,v_{m,n}^{(\alpha)}(\vec{x}) \tag{2.390}$$

   Note that the eigenfunctions $\{v_{m,n}^{(\alpha)}(\vec{x})\}$ are known from solving the
   eigenvalue equations for $A$ and $B$. However, $\psi_0(\vec{x})$ and $c_{m,n}^{(\alpha)}$ have not yet been
   determined.

   Suppose that we measure $A$ and obtain the value $a_M$. The wave function
   collapses to

$$\psi_0'(\vec{x}) = \sum_{n,\alpha} c_{M,n}^{(\alpha)}\,v_{M,n}^{(\alpha)}(\vec{x}) \tag{2.391}$$

   Suppose that we measure $B$ and obtain the value $b_N$. The wave function
   collapses to

$$\psi_0''(\vec{x}) = \sum_\alpha c_{M,N}^{(\alpha)}\,v_{M,N}^{(\alpha)}(\vec{x}) \tag{2.392}$$

   Notice that $\psi_0''(\vec{x})$ is still an eigenfunction of $A$ with eigenvalue $a_M$. The
   measurement of $B$ has not disturbed the value of $A$ (when $A$ and $B$ are
   compatible). In this sense $A$ and $B$ are truly compatible.

3. Let $A$ and $B$ be compatible. Let $\{u_n^{(\alpha)}(\vec{x})\}$ be any CON set of
   eigenfunctions of $A$. This is not the only CON set of eigenfunctions of $A$ if some of
   the eigenvalues are degenerate - there are different possible choices for the
   linearly independent eigenfunctions belonging to a degenerate eigenvalue.
   The given eigenfunctions $\{u_n^{(\alpha)}(\vec{x})\}$ need not be simultaneous
   eigenfunctions of $B$. The compatibility of $A$ and $B$ implies only that there exists
   some CON set of eigenfunctions of both $A$ and $B$ simultaneously.

Suppose $A$ and $B$ are compatible. Suppose we measure $A$ (obtaining the value
$a_M$) and then $B$ (obtaining the value $b_N$). As in comment (2) above, the wave
function right after these compatible measurements is

$$\psi_0''(\vec{x}) = \sum_\alpha c_{M,N}^{(\alpha)}\,v_{M,N}^{(\alpha)}(\vec{x}) \tag{2.393}$$

If there are two or more ON eigenfunctions corresponding to $a_M$ and $b_N$, $\psi_0''(\vec{x})$
is still not completely known (up to a multiplicative constant) because we do
not yet know the individual coefficients $c_{M,N}^{(1)}, c_{M,N}^{(2)}, \ldots$. For such a case, other
compatible measurements must be made on the particle until the wave function
has collapsed to a function that is completely known (up to a multiplicative
constant).

Definition: Let $A, B, C, \ldots, Q$ be the hermitian operators corresponding
to some physical quantities. $A, B, C, \ldots, Q$ are said to form a complete
set of compatible observables if there exists one, and only one, CON set of
simultaneous eigenfunctions of $A, B, C, \ldots, Q$. (Two CON sets are not considered
different if the functions of one set differ from the functions of the other set
by multiplicative constants of modulus unity.) One can prove the existence of
complete sets of compatible observables, but we will not do that here.

Therefore, if $\{v_{\ell mn\ldots}(\vec{x})\}$ = CON set of simultaneous eigenfunctions of $A, B, C,
\ldots, Q$, then

$$\begin{aligned} A v_{\ell mn\ldots}(\vec{x}) &= a_\ell\,v_{\ell mn\ldots}(\vec{x}) \\ B v_{\ell mn\ldots}(\vec{x}) &= b_m\,v_{\ell mn\ldots}(\vec{x}) \\ C v_{\ell mn\ldots}(\vec{x}) &= c_n\,v_{\ell mn\ldots}(\vec{x}) \\ &\ \ \vdots \end{aligned}$$

Note that we have used the fact that if $A, B, C, \ldots, Q$ form a complete set of
compatible observables, then there is only one eigenfunction
(determined up to an arbitrary multiplicative constant) corresponding to given
eigenvalues $a_\ell, b_m, c_n, \ldots$

1. It is possible for a single operator $A$ by itself to form a complete set of
   compatible observables. The operator $A$ need only have eigenvalues which
   are all non-degenerate: if the eigenvalues of $A$ are all non-degenerate,
   then a given eigenvalue has only one linearly independent corresponding
   eigenfunction and there exists only one CON set of eigenfunctions of $A$.

   Two different complete sets of compatible observables $A, B, C, \ldots$ and
   $A', B', C', \ldots$ need not have the same number of operators.

2. For

$$\psi(\vec{x},t_0) = \psi_0(\vec{x}) = \sum_{\ell,m,n,\ldots} K_{\ell mn\ldots}\,v_{\ell mn\ldots}(\vec{x}) \tag{2.394}$$

   Note that the eigenfunctions $\{v_{\ell mn\ldots}(\vec{x})\}$ are known from solving the
   eigenvalue equations for $A, B, C, \ldots$ However, $\psi_0(\vec{x})$ and the expansion
   coefficients $K_{\ell mn\ldots}$ have not yet been determined.

   Suppose we measure $A$ and obtain $a_L$. The wave function then collapses
   to

$$\psi_0'(\vec{x}) = \sum_{m,n,\ldots} K_{Lmn\ldots}\,v_{Lmn\ldots}(\vec{x}) \tag{2.395}$$

   Suppose we measure $B$ and obtain $b_M$. The wave function then collapses
   to

$$\psi_0''(\vec{x}) = \sum_{n,\ldots} K_{LMn\ldots}\,v_{LMn\ldots}(\vec{x}) \tag{2.396}$$

   Suppose we measure all the remaining quantities $C, \ldots$ in the complete
   set of compatible observables $A, B, C, \ldots$ and obtain the values $c_N, \ldots$
   The wave function will finally collapse to

$$\psi_0^{\prime\prime\prime\cdots}(\vec{x}) = K_{LMN\ldots}\,v_{LMN\ldots}(\vec{x}) \tag{2.397}$$

   where no sum is present now.

Notice that $\psi_0^{\prime\prime\prime\cdots}(\vec{x})$ is now completely known (up to a multiplicative constant
$K_{LMN\ldots}$). Thus, measurements made of a complete set of compatible
observables collapse the wave function to a simultaneous eigenfunction of all these
observables - the experimental results $a_L, b_M, c_N, \ldots$ completely determine this
simultaneous eigenfunction $v_{LMN\ldots}(\vec{x})$. Thus, the wave function of the particle
immediately after the measurements has been experimentally determined (up to
a multiplicative constant), that is, the measurements have prepared the particle
so that its wave function is known at a specified time.
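
The sequence (2.394) to (2.397) amounts to repeatedly filtering the expansion coefficients by the measured labels. The sketch below shows this for three compatible observables; the coefficient values $K_{\ell mn}$, the measured labels and the function name are hypothetical and chosen only for illustration.

```python
def measure(coeffs, position, value):
    """Collapse: keep the components whose label at `position` equals `value`.

    coeffs maps a tuple of labels (l, m, n) -> K_lmn; position 0, 1, 2 selects
    the observable A, B or C whose eigenvalue label was just measured.
    """
    kept = {labels: k for labels, k in coeffs.items() if labels[position] == value}
    if not kept:
        raise ValueError("this outcome has zero probability")
    return kept

# Hypothetical expansion coefficients K_lmn of the initial wave function.
psi0 = {
    (1, 1, 1): 0.5, (1, 1, 2): 0.5j, (1, 2, 1): -0.5,
    (2, 1, 1): 0.3, (2, 2, 2): 0.4,
}

after_A = measure(psi0, 0, 1)      # A yields a_1: sum over m, n remains
after_B = measure(after_A, 1, 1)   # B yields b_1: sum over n remains
after_C = measure(after_B, 2, 2)   # C yields c_2: a single term remains

if __name__ == "__main__":
    print(len(after_A), len(after_B), after_C)
```

Once every observable in the complete set has been measured, exactly one term survives, so the wave function is known up to the single constant $K_{LMN\ldots}$.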
Chapter 3


Formulation of Wave Mechanics - Part 2 


3.1. Complete Orthonormal Sets 

Our earlier discussions give a simple criterion for observables to be compatible. 
In particular, we showed that 

A and B are compatible implies that [ A, B] = 0 

In addition we have the following theorem: 

[A, B] = 0 implies that A and B are compatible 

The proof of this theorem provides a practical way to construct a CON set of 
simultaneous eigenfunctions of two compatible observables. 

Proof: Let $\{u_n^{(\alpha)}(\vec{x})\}$ be any CON set of eigenfunctions of $A$,

$$A u_n^{(\alpha)}(\vec{x}) = a_n\,u_n^{(\alpha)}(\vec{x}) \tag{3.1}$$

and assume that $[A, B] = 0$. The above eigenfunctions of $A$ need not be
eigenfunctions of $B$. We will now show how to construct from the $\{u_n^{(\alpha)}\}$ a CON
set of simultaneous eigenfunctions of $A$ and $B$ - this will prove that $A$ and $B$
are compatible.

(a) Consider a particular eigenvalue $a_n$. If it is non-degenerate, then the
following is valid:

    $u_n^{(1)}$ is the only linearly independent
    eigenfunction corresponding to $a_n$

Claim: $u_n^{(1)}$ is necessarily an eigenfunction of $B$ also.

Proof: We have

$$AB u_n^{(1)} = BA u_n^{(1)} = B a_n u_n^{(1)} = a_n B u_n^{(1)} \tag{3.2}$$

Therefore, $B u_n^{(1)}$ is an eigenfunction of $A$ with eigenvalue $a_n$ and since $a_n$ is
non-degenerate, we must have $B u_n^{(1)} = (\text{constant})\,u_n^{(1)}$. But this says that $u_n^{(1)}$
is an eigenfunction of $B$.

Therefore, eigenfunctions of $A$ corresponding to non-degenerate eigenvalues are
necessarily eigenfunctions of $B$ also.


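(b) Consider a particular eigenvalue $a_n$. If it is degenerate, then the following
is valid:

    $g$-fold degeneracy implies that $u_n^{(1)}, u_n^{(2)}, \ldots, u_n^{(g)}$
    are ON eigenfunctions of $A$ with eigenvalue $a_n$
    and any eigenfunction of $A$ with eigenvalue $a_n$
    can be expressed as a linear superposition of
    these $g$ eigenfunctions.

Now, consider the case where $u_n^{(\beta)}$ is not necessarily an eigenfunction of $B$. We
have

$$AB u_n^{(\beta)} = BA u_n^{(\beta)} = B a_n u_n^{(\beta)} = a_n B u_n^{(\beta)} \tag{3.3}$$

so that $B u_n^{(\beta)}$ is an eigenfunction of $A$ with eigenvalue $a_n$ and can be expressed
as a linear superposition of $u_n^{(1)}, u_n^{(2)}, \ldots, u_n^{(g)}$:

$$B u_n^{(\beta)}(\vec{x}) = \sum_{\alpha=1}^{g} b_{\alpha\beta}^{(n)}\,u_n^{(\alpha)}(\vec{x}) \quad \text{for } \beta = 1, 2, \ldots, g \tag{3.4}$$

Relabeling indices: $b_{\alpha\beta}^{(n)} = \langle u_n^{(\alpha)} | B u_n^{(\beta)} \rangle$. Note that $\alpha$ is the first index and $\beta$ is
the second index on both sides of this equation.

$b^{(n)}$ is a $g \times g$ matrix where the matrix elements of $b^{(n)}$ are computed from the
given ON eigenfunctions of $A$ corresponding to the eigenvalue $a_n$.

$$b^{(n)} = \begin{pmatrix} b_{11}^{(n)} & b_{12}^{(n)} & \cdots \\ b_{21}^{(n)} & b_{22}^{(n)} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \tag{3.5}$$

The matrix $b^{(n)}$ is hermitian, that is, the matrix is equal to the complex
conjugate of its transposed matrix:

$$b_{\alpha\beta}^{(n)} = \langle u_n^{(\alpha)} | B u_n^{(\beta)} \rangle = \langle B u_n^{(\alpha)} | u_n^{(\beta)} \rangle = \langle u_n^{(\beta)} | B u_n^{(\alpha)} \rangle^* = b_{\beta\alpha}^{(n)*} \tag{3.6}$$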
An) An) 
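Claim: We can find $g$ linearly independent functions, each of the form

$$\sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)} \tag{3.7}$$

which are eigenfunctions of $B$ (they are obviously eigenfunctions of $A$
corresponding to eigenvalue $a_n$).

Proof: Let us show how to find the $c_\alpha$ such that

$$\sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)}$$

is an eigenfunction of $B$. We have

$$B \sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)} = b \sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)} \quad \text{where } b = \text{yet to be determined eigenvalue}$$

$$B \sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)} = \sum_{\alpha=1}^{g} c_\alpha\,B u_n^{(\alpha)} = \sum_{\alpha=1}^{g} c_\alpha \sum_{\beta=1}^{g} b_{\beta\alpha}^{(n)}\,u_n^{(\beta)} = b \sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)} = b \sum_{\beta=1}^{g} c_\beta\,u_n^{(\beta)}$$

$$\sum_{\alpha=1}^{g} \sum_{\beta=1}^{g} c_\alpha\,b_{\beta\alpha}^{(n)}\,u_n^{(\beta)} = b \sum_{\beta=1}^{g} c_\beta\,u_n^{(\beta)}$$

$$\sum_{\beta=1}^{g} u_n^{(\beta)} \left( \sum_{\alpha=1}^{g} b_{\beta\alpha}^{(n)}\,c_\alpha - b\,c_\beta \right) = 0 \quad\Rightarrow\quad \sum_{\alpha=1}^{g} b_{\beta\alpha}^{(n)}\,c_\alpha - b\,c_\beta = 0$$

for $\beta = 1, 2, \ldots, g$. This result is true because the $u_n^{(\beta)}$ are linearly
independent.

Using the notation

$$c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_g \end{pmatrix} = \text{column vector} \tag{3.8}$$

this equation becomes

$$b^{(n)} c = b\,c \tag{3.9}$$

which is an eigenvalue equation for the $g \times g$ hermitian matrix $b^{(n)}$. Now $(b^{(n)} - b)\,c = 0$
has a non-trivial solution for $c$ provided that $(b^{(n)} - b)$ is not invertible,
that is,

$$\det(b^{(n)} - b) = 0 \tag{3.10}$$

$$\det \begin{pmatrix} b_{11}^{(n)} - b & b_{12}^{(n)} & \cdots \\ b_{21}^{(n)} & b_{22}^{(n)} - b & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} = 0 \tag{3.11}$$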



The various possible eigenvalues $b$ can be found by solving this algebraic
equation for $b$. For each $b$ found, we can then find at least one non-trivial $c$ satisfying
$b^{(n)} c = b\,c$ (just solve this equation for $c$ with $b$ given). It is a basic result of
linear algebra that the eigenvectors of a hermitian matrix are complete (the
spectral theorem we stated in Chapter 2 is a generalization of this theorem to
hermitian operators on the infinite-dimensional space $L^2$). This means that,
for the $g \times g$ hermitian matrix $b^{(n)}$, one can find $g$ linearly independent column
vectors $c$,

$$c^{(1)} = \begin{pmatrix} c_1^{(1)} \\ c_2^{(1)} \\ \vdots \\ c_g^{(1)} \end{pmatrix}, \quad c^{(2)} = \begin{pmatrix} c_1^{(2)} \\ c_2^{(2)} \\ \vdots \\ c_g^{(2)} \end{pmatrix}, \quad \ldots, \quad c^{(g)} = \begin{pmatrix} c_1^{(g)} \\ c_2^{(g)} \\ \vdots \\ c_g^{(g)} \end{pmatrix}$$

Several of these column vectors may correspond to the same eigenvalue $b$, while
others correspond to different eigenvalues. We have thus shown how to find $g$
linearly independent functions, each of the form

$$\sum_{\alpha=1}^{g} c_\alpha\,u_n^{(\alpha)} \tag{3.12}$$

which are eigenfunctions of $B$ (the eigenvalue is the value of $b$ which determined
this $c$) and of $A$ (the eigenvalue is $a_n$).

Let me summarize the results (a) and (b) obtained so far in the proof of the
theorem.

$[A, B] = 0$ and $\{u_n^{(\alpha)}(\vec{x})\}$ is any CON set of eigenfunctions of $A$

(a) If $a_n$ is non-degenerate, then the only eigenfunction corresponding to $a_n$
    is $u_n^{(1)}$ and $u_n^{(1)}$ is automatically a simultaneous eigenfunction of $B$.

(b) If $a_n$ is degenerate ($g$-fold degeneracy), then $u_n^{(1)}, u_n^{(2)}, \ldots, u_n^{(g)}$ are the
    corresponding ON eigenfunctions and $u_n^{(\alpha)}$ is not necessarily an eigenfunction
    of $B$. Now form the $g \times g$ hermitian matrix $b^{(n)}$, whose matrix elements
    are $b_{\alpha\beta}^{(n)} = \langle u_n^{(\alpha)} | B u_n^{(\beta)} \rangle$, $\alpha, \beta = 1, 2, \ldots, g$. Solve the equation $b^{(n)} c = b\,c$.
    The eigenvalues $b$ are obtained by solving $\det(b^{(n)} - b) = 0$. Knowing
    the eigenvalues $b$, we can then find $g$ linearly independent column
    vectors $c^{(1)}, c^{(2)}, \ldots, c^{(g)}$ which solve the eigenvalue equation. The $g$ linearly
    independent functions

$$\phi_n^{(\beta)}(\vec{x}) = \sum_{\alpha=1}^{g} c_\alpha^{(\beta)}\,u_n^{(\alpha)}(\vec{x})\ , \quad \beta = 1, 2, \ldots, g \tag{3.13}$$

    are eigenfunctions of $B$ (the eigenvalue is the value of $b$ which determined
    this

$$c^{(\beta)} = \begin{pmatrix} c_1^{(\beta)} \\ \vdots \\ c_g^{(\beta)} \end{pmatrix} \tag{3.14}$$

    ) and of $A$ (the eigenvalue is $a_n$). The $\phi_n^{(\beta)}$'s which correspond to
    different eigenvalues $b$ are necessarily orthogonal; the $\phi_n^{(\beta)}$'s which belong to
    the same eigenvalue $b$ can be constructed to be orthogonal by the
    Gram-Schmidt process. All the $\phi_n^{(\beta)}$'s can be normalized. Thus, one can always
    find $\phi_n^{(\beta)}$'s satisfying $\langle \phi_n^{(\alpha)} | \phi_n^{(\beta)} \rangle = \delta_{\alpha\beta}$.
We now have a method for constructing a CON set of simultaneous eigenfunctions
of A and B. For each eigenvalue of A, we can find g ON simultaneous
eigenfunctions of A and B when the eigenvalue is g-fold degenerate (g = 1 means
a non-degenerate eigenvalue). Doing this for all the eigenvalues of A yields a
CON set of simultaneous eigenfunctions of A and B. The existence of a CON
set of simultaneous eigenfunctions implies that A and B are compatible.


Example: Construction of a CON set of simultaneous eigenfunctions of two 
commuting hermitian operators. 

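Let the functions $u_1^{(1)}(\vec{x}), u_1^{(2)}(\vec{x}), u_2(\vec{x}), u_3(\vec{x}), u_4(\vec{x}), \ldots$ be a CON set in $L^2$.

Define the linear operators A and B by their action on this CON set:

    A                                 B
    --------------------------------  --------------------------------
    $A u_1^{(1)} = u_1^{(1)}$         $B u_1^{(1)} = u_1^{(2)}$
    $A u_1^{(2)} = u_1^{(2)}$         $B u_1^{(2)} = u_1^{(1)}$
    $A u_2 = 2u_2$                    $B u_2 = 2u_2$
    $A u_3 = 3u_3$                    $B u_3 = 3u_3$
    ...                               ...
    $A u_n = n u_n$                   $B u_n = n u_n$
    ...                               ...

Table 3.1: Definition of A and B


Note that $u_2(\vec{x}), u_3(\vec{x}), u_4(\vec{x}), \ldots$ are simultaneous eigenfunctions of A and B,
while $u_1^{(1)}(\vec{x}), u_1^{(2)}(\vec{x})$ are degenerate eigenfunctions of A (eigenvalue = +1), but
are not eigenfunctions of B.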

Notes: 

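1. A linear operator A is completely defined by specifying its action on some
CON set $\{v_n^{(\alpha)}(\vec{x})\}$: $A v_n^{(\alpha)}(\vec{x})$ given for all $n, \alpha$. This is true because an
arbitrary $\phi(\vec{x})$ can be written as

$$\phi(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)} v_n^{(\alpha)}(\vec{x}) \qquad (3.15)$$

Therefore,

$$A\phi(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)} \underbrace{A v_n^{(\alpha)}(\vec{x})}_{\text{given}} \qquad (3.16)$$

and thus, the action of A on any $\phi(\vec{x})$ is determined. For example,

$[A, B] = 0$ if $[A, B]\, v_n^{(\alpha)}(\vec{x}) = 0$ for all $n, \alpha$

2. A linear operator A is hermitian if $\langle \phi \,|\, A\phi \rangle = \langle A\phi \,|\, \phi \rangle$ for all $\phi(\vec{x})$. But an
arbitrary $\phi(\vec{x})$ can be written

$$\phi(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)} v_n^{(\alpha)}(\vec{x}) \qquad (3.17)$$

when $\{v_n^{(\alpha)}(\vec{x})\}$ is some CON set. Therefore, A is hermitian if

$$\Big\langle \sum_{n,\alpha} c_n^{(\alpha)} v_n^{(\alpha)} \,\Big|\, A \sum_{m,\beta} c_m^{(\beta)} v_m^{(\beta)} \Big\rangle = \Big\langle A \sum_{n,\alpha} c_n^{(\alpha)} v_n^{(\alpha)} \,\Big|\, \sum_{m,\beta} c_m^{(\beta)} v_m^{(\beta)} \Big\rangle$$

that is, if

$$\sum_{n,\alpha} c_n^{(\alpha)*} \sum_{m,\beta} c_m^{(\beta)} \langle v_n^{(\alpha)} \,|\, A v_m^{(\beta)} \rangle = \sum_{n,\alpha} c_n^{(\alpha)*} \sum_{m,\beta} c_m^{(\beta)} \langle A v_n^{(\alpha)} \,|\, v_m^{(\beta)} \rangle$$

Because this must hold for arbitrary coefficients, we conclude that A is
hermitian if $\langle v_n^{(\alpha)} \,|\, A v_m^{(\beta)} \rangle = \langle A v_n^{(\alpha)} \,|\, v_m^{(\beta)} \rangle$ for all $n, m, \alpha, \beta$. This
is a simple test for hermiticity when the action of A on a CON set is known.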

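Because A and B commute, we can find a CON set of simultaneous eigenfunctions.


A's eigenvalues 2, 3, 4, ... are non-degenerate and therefore the corresponding
eigenfunctions $u_2(\vec{x}), u_3(\vec{x}), u_4(\vec{x}), \ldots$ are necessarily eigenfunctions of B also.

A's eigenvalue 1 is 2-fold degenerate; $u_1^{(1)}(\vec{x}), u_1^{(2)}(\vec{x})$ are not eigenfunctions of
B. We form the $2 \times 2$ matrix $b^{(n=1)}$ where

$$b_{\alpha\beta}^{(1)} = \langle u_1^{(\alpha)} \,|\, B u_1^{(\beta)} \rangle \qquad (3.18)$$

so that

$$b_{11}^{(1)} = \langle u_1^{(1)} \,|\, B u_1^{(1)} \rangle = \langle u_1^{(1)} \,|\, u_1^{(2)} \rangle = 0$$
$$b_{22}^{(1)} = \langle u_1^{(2)} \,|\, B u_1^{(2)} \rangle = \langle u_1^{(2)} \,|\, u_1^{(1)} \rangle = 0$$
$$b_{12}^{(1)} = \langle u_1^{(1)} \,|\, B u_1^{(2)} \rangle = \langle u_1^{(1)} \,|\, u_1^{(1)} \rangle = 1$$
$$b_{21}^{(1)} = \langle u_1^{(2)} \,|\, B u_1^{(1)} \rangle = \langle u_1^{(2)} \,|\, u_1^{(2)} \rangle = 1$$

Therefore

$$b^{(n=1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (3.19)$$

and we have

$$\det(b^{(1)} - b) = 0 \;\Rightarrow\; \det \begin{pmatrix} -b & 1 \\ 1 & -b \end{pmatrix} = b^2 - 1 = 0 \;\Rightarrow\; b = \pm 1 \qquad (3.20)$$

Case #1: b = +1

$$b^{(1)} c^{+} = c^{+} \;\Rightarrow\; \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} c_1^{+} \\ c_2^{+} \end{pmatrix} = \begin{pmatrix} c_1^{+} \\ c_2^{+} \end{pmatrix} \;\Rightarrow\; c_2^{+} = c_1^{+} \qquad (3.21)$$

Case #2: b = -1

$$b^{(1)} c^{-} = -c^{-} \;\Rightarrow\; \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} c_1^{-} \\ c_2^{-} \end{pmatrix} = -\begin{pmatrix} c_1^{-} \\ c_2^{-} \end{pmatrix} \;\Rightarrow\; c_2^{-} = -c_1^{-} \qquad (3.22)$$

Therefore, $c_1^{+} u_1^{(1)} + c_2^{+} u_1^{(2)} = c_1^{+}(u_1^{(1)} + u_1^{(2)})$ is a simultaneous eigenfunction of A
(eigenvalue = +1) and of B (eigenvalue = +1), and $c_1^{-} u_1^{(1)} + c_2^{-} u_1^{(2)} = c_1^{-}(u_1^{(1)} - u_1^{(2)})$
is a simultaneous eigenfunction of A (eigenvalue = +1) and of B (eigenvalue
= -1).


To normalize these functions we note that

$$\langle u_1^{(1)} \pm u_1^{(2)} \,|\, u_1^{(1)} \pm u_1^{(2)} \rangle = 2 \;\Rightarrow\; c_1^{\pm} = \frac{1}{\sqrt{2}} \qquad (3.23)$$

Thus,

$$\frac{1}{\sqrt{2}}\left(u_1^{(1)} + u_1^{(2)}\right), \quad \frac{1}{\sqrt{2}}\left(u_1^{(1)} - u_1^{(2)}\right), \quad u_2, \; u_3, \; u_4, \ldots \qquad (3.24)$$

with eigenvalues

$$+1\,(A), +1\,(B) \qquad +1\,(A), -1\,(B) \qquad 2\,(A,B) \qquad 3\,(A,B) \qquad 4\,(A,B) \quad \ldots \qquad (3.25)$$

form a CON set of simultaneous eigenfunctions of A and B.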
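
The diagonalization of the degenerate block can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper `eig_sym_2x2` is a hypothetical name, it handles only real symmetric 2 x 2 matrices, and it uses nothing beyond the Python standard library. It is applied to the matrix of (3.19), whose rows and columns are labelled by the two degenerate eigenfunctions of A.

```python
import math

def eig_sym_2x2(a, b, d):
    """Return [(eigenvalue, unit eigenvector), ...] for the real symmetric
    matrix [[a, b], [b, d]], largest eigenvalue first."""
    if b == 0.0:
        # Already diagonal: the basis vectors are the eigenvectors.
        return sorted([(a, (1.0, 0.0)), (d, (0.0, 1.0))], reverse=True)
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # First row of (M - lam) v = 0 reads (a - lam) v1 + b v2 = 0,
        # which is solved by v = (b, lam - a).
        v1, v2 = b, lam - a
        norm = math.hypot(v1, v2)
        pairs.append((lam, (v1 / norm, v2 / norm)))
    return pairs

# The 2 x 2 block for the degenerate eigenvalue a = 1 of A.
for b_val, (c1, c2) in eig_sym_2x2(0.0, 1.0, 0.0):
    print(f"b = {b_val:+.0f}:  c1 = {c1:+.4f}, c2 = {c2:+.4f}")
```

The two eigenvectors returned have equal components for b = +1 and opposite components for b = -1, each normalized by the factor 1/sqrt(2), in agreement with the simultaneous eigenfunctions constructed in the example.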

Summarizing our results: Two hermitian operators are compatible if, and 
only if, they commute: compatibility is equivalent to commutativity. Compatible 
physical quantities are important when one wants to prepare a particle so that 
its wave function is known (up to a multiplicative constant) at a specified time: 
measurements made of a complete set of compatible observables collapse the 
wave function to a simultaneous eigenfunction of all these observables and the 
experimental results from these measurements uniquely determine the simulta¬ 
neous eigenfunction to which the wave function collapsed. 

There is another important use of commuting observables. Suppose we want
to find the eigenfunctions and eigenvalues of some physical quantity A. If we
already know the eigenfunctions of a physical quantity B which commutes with
A, then our task is greatly simplified because there must exist a CON set of
eigenfunctions of B which are simultaneous eigenfunctions of A.


Example: Free Particle in Three Dimensions (no forces present): We
have

$$H_{op} = \frac{p_{xop}^2 + p_{yop}^2 + p_{zop}^2}{2m} \qquad (3.26)$$

Let us find the eigenfunctions and eigenvalues of $H_{op}$ (discrete and continuum
parts) using the eigenvalue equation

$$H_{op}\,\phi(\vec{x}) = E\,\phi(\vec{x}) \qquad (3.27)$$


Method #1: $p_{xop}, p_{yop}, p_{zop}, H_{op}$ are obviously mutually commuting operators,
that is, any pair of these operators commute. Therefore, we can find a CON set
of simultaneous eigenfunctions of these 4 operators. Now, the eigenfunctions of
$p_{xop}, p_{yop}, p_{zop}$ have already been derived:

eigenfunctions of $p_{xop}$ → $N_x(y,z)\,e^{i p_x x/\hbar}$ with eigenvalues $p_x \in (-\infty, +\infty)$
eigenfunctions of $p_{yop}$ → $N_y(x,z)\,e^{i p_y y/\hbar}$ with eigenvalues $p_y \in (-\infty, +\infty)$
eigenfunctions of $p_{zop}$ → $N_z(x,y)\,e^{i p_z z/\hbar}$ with eigenvalues $p_z \in (-\infty, +\infty)$


Clearly, the simultaneous eigenfunctions of $p_{xop}, p_{yop}, p_{zop}$ are

$$u_{p_x p_y p_z}(\vec{x}) = N e^{i p_x x/\hbar}\, e^{i p_y y/\hbar}\, e^{i p_z z/\hbar} = N e^{\frac{i}{\hbar}(p_x x + p_y y + p_z z)} \qquad (3.28)$$

where N is independent of x, y, z, and these functions form a complete set. Given
$p_x, p_y, p_z$, there exists only one linearly independent eigenfunction (no
degeneracy remains after $p_x, p_y, p_z$ are given). Therefore, the functions
$u_{p_x p_y p_z}(\vec{x})$ are also eigenfunctions of $H_{op}$. Finding the eigenvalues of $H_{op}$ is simple:

$$H_{op}\, u_{p_x p_y p_z} = \left(\frac{p_{xop}^2 + p_{yop}^2 + p_{zop}^2}{2m}\right) u_{p_x p_y p_z} = \left(\frac{p_x^2 + p_y^2 + p_z^2}{2m}\right) u_{p_x p_y p_z} = E\, u_{p_x p_y p_z} \qquad (3.29)$$

so that

$$E = \frac{p_x^2 + p_y^2 + p_z^2}{2m} \;, \quad p_x, p_y, p_z \in (-\infty, +\infty) \qquad (3.30)$$

Thus, the spectrum of $H_{op}$ has only a continuum part, with the eigenvalue E
anywhere in $[0, +\infty)$.


Method #2: Separation of variables (SOV): This is the method used to
solve certain partial differential equations. We have

$$H_{op}\,\phi(\vec{x}) = E\,\phi(\vec{x}) \qquad (3.31)$$

We look for eigenfunctions satisfying earlier conditions (a) and (b) so that the
eigenfunctions must not become infinite as $|\vec{x}| \to \infty$.


Now, inserting differential operators, the eigenvalue equation becomes

$$H_{op}\,\phi(\vec{x}) = \frac{p_{xop}^2 + p_{yop}^2 + p_{zop}^2}{2m}\,\phi(\vec{x}) = -\frac{\hbar^2}{2m}\left(\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2}\right) = E\,\phi(\vec{x}) \qquad (3.32)$$

for fixed E. The allowed values of E are then determined by requiring that $\phi(\vec{x})$
does not become infinite as $|\vec{x}| \to \infty$. This is called a boundary condition.


In this SOV method, we look for solutions of the form $\phi(\vec{x}) = X(x)Y(y)Z(z)$,
i.e., separated variables. Substituting into the partial differential equation we
have

$$-\frac{\hbar^2}{2m}\left(YZ\frac{d^2X}{dx^2} + XZ\frac{d^2Y}{dy^2} + XY\frac{d^2Z}{dz^2}\right) = E\,(XYZ) \qquad (3.33)$$

Dividing by XYZ (recall that $\phi(\vec{x})$ is not identically zero) we get

$$\underbrace{\left(-\frac{\hbar^2}{2m}\frac{1}{X}\frac{d^2X}{dx^2}\right)}_{\text{function of } x \text{ only} \;=\; F_1(x)} + \underbrace{\left(-\frac{\hbar^2}{2m}\frac{1}{Y}\frac{d^2Y}{dy^2}\right)}_{\text{function of } y \text{ only} \;=\; F_2(y)} + \underbrace{\left(-\frac{\hbar^2}{2m}\frac{1}{Z}\frac{d^2Z}{dz^2}\right)}_{\text{function of } z \text{ only} \;=\; F_3(z)} = E = \text{constant} \qquad (3.34)$$

This equation must be true for all x, y, z where x, y, z are independent variables.
In particular, the equation must be valid as x varies while y and z are kept fixed.
Thus, $F_1(x)$ must be independent of x. In a similar manner, we arrive at the
conclusion that $F_2(y)$ must be independent of y and $F_3(z)$ must be independent
of z. Therefore,

$$F_1(x) = \text{constant} \;, \quad F_2(y) = \text{constant} \;, \quad F_3(z) = \text{constant} \qquad (3.35)$$

We let

$$F_1(x) = \frac{\hbar^2 k_1^2}{2m} \;, \quad F_2(y) = \frac{\hbar^2 k_2^2}{2m} \;, \quad F_3(z) = \frac{\hbar^2 k_3^2}{2m} \qquad (3.36)$$

where $k_1, k_2, k_3$ are constants (in principle, complex constants). The quantities

$$\frac{\hbar^2 k_1^2}{2m} \;, \quad \frac{\hbar^2 k_2^2}{2m} \;, \quad \frac{\hbar^2 k_3^2}{2m} \qquad (3.37)$$

are called separation constants. We then obtain the eigenvalues in terms of these
three unknown constants as

$$E = \frac{\hbar^2 k_1^2}{2m} + \frac{\hbar^2 k_2^2}{2m} + \frac{\hbar^2 k_3^2}{2m} \qquad (3.38)$$

and the three equations

$$\frac{d^2X}{dx^2} = -k_1^2 X \;\to\; X = e^{\pm i k_1 x} \qquad (3.39)$$

$$\frac{d^2Y}{dy^2} = -k_2^2 Y \;\to\; Y = e^{\pm i k_2 y} \qquad (3.40)$$

$$\frac{d^2Z}{dz^2} = -k_3^2 Z \;\to\; Z = e^{\pm i k_3 z} \qquad (3.41)$$

Since X, Y, Z must not become infinite as $|x| \to \infty$, $|y| \to \infty$, $|z| \to \infty$, the
constants $k_1, k_2, k_3$ must be pure real (positive or negative).

It is sufficient to write

$$X = e^{+i k_1 x} \;, \quad Y = e^{+i k_2 y} \;, \quad Z = e^{+i k_3 z} \qquad (3.42)$$

because $k_1, k_2, k_3$ may be positive or negative.

Therefore,

$$\phi(\vec{x}) = N e^{+i(k_1 x + k_2 y + k_3 z)}$$

$$E = \frac{\hbar^2}{2m}\left(k_1^2 + k_2^2 + k_3^2\right) \in [0, +\infty) \;, \quad k_1, k_2, k_3 \text{ pure real}$$

With $k_i = p_i/\hbar$ these results agree with those from Method #1. One might ask
how we know that the eigenfunctions of the form $\phi(\vec{x}) = X(x)Y(y)Z(z)$ give a
complete set of eigenfunctions. Method #2 does not tell us. Only Method #1
really shows that the eigenfunctions we have obtained are complete.

3.2. Time Development 

We have discussed in detail how a measurement collapses the wave function in
Chapter 2. By measuring a complete set of compatible observables, one prepares
a particle so that its wave function immediately after the measurements (at time
$t_0$) is known (up to a multiplicative constant). $\psi(\vec{x}, t_0)$ is therefore known (up
to a normalization factor). Suppose another measurement is made at some later
time $t_1$ ($t_1 > t_0$). The probability distribution of results of such a measurement
can be determined if one knows the wave function $\psi(\vec{x}, t_1)$ right before this
measurement is made. We must therefore specify the time-development of the
wave function between measurements, that is, between $t_0$ and $t_1$, so that $\psi(\vec{x}, t_1)$
can be calculated in terms of $\psi(\vec{x}, t_0)$. Naturally, the time-development depends
on the forces acting on the particle.

3.2.1. Mathematical Preliminaries 

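1. Let A and B be linear operators. If $\langle \phi_1 \,|\, A\phi_2 \rangle = \langle \phi_1 \,|\, B\phi_2 \rangle$ for all $\phi_1, \phi_2$,
then A = B.

Proof: We have $0 = \langle \phi_1 \,|\, A\phi_2 \rangle - \langle \phi_1 \,|\, B\phi_2 \rangle = \langle \phi_1 \,|\, (A - B)\phi_2 \rangle$. Now choose
$\phi_1 = (A - B)\phi_2$. Then $0 = \langle (A - B)\phi_2 \,|\, (A - B)\phi_2 \rangle \Rightarrow (A - B)\phi_2 = 0$ for all $\phi_2$,
or A - B = 0.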


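2. Let A and B be linear operators. If $\langle \phi \,|\, A\phi \rangle = \langle \phi \,|\, B\phi \rangle$ for all $\phi$, that is,
if $\langle A \rangle = \langle B \rangle$ for all states, then A = B.

Proof: Let $\phi = \phi_1 + \lambda\phi_2$, where $\lambda$ is an arbitrary complex constant.
Therefore,

$$\langle \phi \,|\, A\phi \rangle = \langle \phi_1 + \lambda\phi_2 \,|\, A(\phi_1 + \lambda\phi_2) \rangle = \langle \phi_1 \,|\, A\phi_1 \rangle + |\lambda|^2 \langle \phi_2 \,|\, A\phi_2 \rangle + \lambda \langle \phi_1 \,|\, A\phi_2 \rangle + \lambda^* \langle \phi_2 \,|\, A\phi_1 \rangle$$

$$\langle \phi \,|\, B\phi \rangle = \langle \phi_1 + \lambda\phi_2 \,|\, B(\phi_1 + \lambda\phi_2) \rangle = \langle \phi_1 \,|\, B\phi_1 \rangle + |\lambda|^2 \langle \phi_2 \,|\, B\phi_2 \rangle + \lambda \langle \phi_1 \,|\, B\phi_2 \rangle + \lambda^* \langle \phi_2 \,|\, B\phi_1 \rangle$$

But $\langle \phi_1 \,|\, A\phi_1 \rangle = \langle \phi_1 \,|\, B\phi_1 \rangle$ and $\langle \phi_2 \,|\, A\phi_2 \rangle = \langle \phi_2 \,|\, B\phi_2 \rangle$. Therefore,

$$\lambda \langle \phi_1 \,|\, A\phi_2 \rangle + \lambda^* \langle \phi_2 \,|\, A\phi_1 \rangle = \lambda \langle \phi_1 \,|\, B\phi_2 \rangle + \lambda^* \langle \phi_2 \,|\, B\phi_1 \rangle \quad \text{for all } \lambda \qquad (3.43)$$

and

$$\lambda = 1 \;\to\; \langle \phi_1 \,|\, A\phi_2 \rangle + \langle \phi_2 \,|\, A\phi_1 \rangle = \langle \phi_1 \,|\, B\phi_2 \rangle + \langle \phi_2 \,|\, B\phi_1 \rangle \qquad (3.44)$$

$$\lambda = i \;\to\; \langle \phi_1 \,|\, A\phi_2 \rangle - \langle \phi_2 \,|\, A\phi_1 \rangle = \langle \phi_1 \,|\, B\phi_2 \rangle - \langle \phi_2 \,|\, B\phi_1 \rangle \qquad (3.45)$$

Adding (3.44) and (3.45) gives $\langle \phi_1 \,|\, A\phi_2 \rangle = \langle \phi_1 \,|\, B\phi_2 \rangle$ for all $\phi_1, \phi_2$, so that A = B.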

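3. We have

$$[x_i, p_j] = i\hbar\,\delta_{ij} \;, \quad [x_i, x_j] = 0 \;, \quad [p_i, p_j] = 0 \qquad (3.46)$$

We can cleverly rewrite these equations by letting A be one of the operators
$x, y, z, p_x, p_y$ or $p_z$:

$$[x_i, A] = i\hbar\,\frac{\partial A}{\partial p_i} \;, \quad [A, p_j] = i\hbar\,\frac{\partial A}{\partial x_j} \qquad (3.47)$$

where

$$\frac{\partial x_i}{\partial x_j} = \delta_{ij} \;, \quad \frac{\partial p_i}{\partial p_j} = \delta_{ij} \;, \quad \frac{\partial x_i}{\partial p_j} = 0 \;, \quad \frac{\partial p_i}{\partial x_j} = 0 \qquad (3.48)$$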


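Let $f = f(\vec{x}, \vec{p}) = \sum (ABCD\ldots)$ = sum of terms of the form $(ABCD\ldots)$,
where each operator $A, B, C, D, \ldots$ is one of the operators $x, y, z, p_x, p_y$ or
$p_z$. For example,

$$f(\vec{x}, \vec{p}) = x p_x x z + p_y^2 x \qquad (3.49)$$

Now,

$$[x_i, f(\vec{x}, \vec{p})] = \Big[x_i, \sum (ABCD\ldots)\Big] = \sum [x_i, ABCD\ldots]$$
$$= \sum \big( [x_i, A]\,BCD\ldots + A\,[x_i, B]\,CD\ldots + AB\,[x_i, C]\,D\ldots + \ldots \big)$$
$$= \sum \Big( i\hbar\frac{\partial A}{\partial p_i} BCD\ldots + A\, i\hbar\frac{\partial B}{\partial p_i} CD\ldots + AB\, i\hbar\frac{\partial C}{\partial p_i} D\ldots + \ldots \Big)$$
$$= i\hbar \sum \Big( \frac{\partial A}{\partial p_i} BCD\ldots + A\frac{\partial B}{\partial p_i} CD\ldots + AB\frac{\partial C}{\partial p_i} D\ldots + \ldots \Big)$$
$$= i\hbar\,\frac{\partial}{\partial p_i} \sum (ABCD\ldots) = i\hbar\,\frac{\partial f}{\partial p_i} \qquad (3.50)$$

and in a similar manner we have

$$[f(\vec{x}, \vec{p}), p_i] = i\hbar\,\frac{\partial f}{\partial x_i} \qquad (3.51)$$

Note: When differentiating an operator of the form $ABCD\ldots$, one must
maintain the order of the operators $A, B, C, D, \ldots$ because the operators
do not necessarily commute.


Example: Let

$$f(\vec{x}, \vec{p}) = x p_x x z + p_y^2 x \qquad (3.52)$$

Then

$$[x, f(\vec{x}, \vec{p})] = i\hbar\,\frac{\partial f}{\partial p_x} = i\hbar\,\big(x(1)xz + 0\big) = i\hbar\, x^2 z \qquad (3.53)$$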

We now proceed to consider the time-development of the wave function between
measurements. This time-development will be our final postulate. We will
motivate this postulate by deriving it from several rather reasonable assumptions.


Consider a particle under the influence of a conservative force:

$$H = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \qquad (3.54)$$

Let us try to obtain a differential equation which will determine the behavior
of $\psi(\vec{x}, t)$ with time.

Assumption (a) - $\psi(\vec{x}, t)$ is completely determined by knowing the wave
function at some initial time $t_0$, that is, $\psi(\vec{x}, t_0)$ at all $\vec{x}$ determines $\psi(\vec{x}, t)$
for all time t. In particular, we are assuming that $\partial\psi(\vec{x}, t_0)/\partial t$ need not
be given as an initial condition. Thus, the differential equation obeyed by
$\psi(\vec{x}, t)$ must be first order in time:

$$\frac{\partial \psi}{\partial t} = \frac{1}{i\hbar}\,\big[\theta(\vec{x}, \vec{p}, t)\big]\,\psi \;, \quad \vec{p} = \frac{\hbar}{i}\nabla \qquad (3.55)$$

where $\theta(\vec{x}, \vec{p}, t)$ is some complicated operator which, in principle, might be
non-linear. The factor $1/i\hbar$ has been separated out for later convenience.





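Assumption (b) - If $\psi_1(\vec{x}, t)$ and $\psi_2(\vec{x}, t)$ are solutions of (3.55), then
any linear superposition $\lambda_1\psi_1(\vec{x}, t) + \lambda_2\psi_2(\vec{x}, t)$ is also a solution. This is
essentially Postulate (1b) from earlier (linear superposition). Therefore, if
we have

$$\frac{\partial \psi_1}{\partial t} = \frac{1}{i\hbar}\theta\psi_1 \quad \text{and} \quad \frac{\partial \psi_2}{\partial t} = \frac{1}{i\hbar}\theta\psi_2 \qquad (3.56)$$

then this implies that

$$\frac{\partial}{\partial t}(\lambda_1\psi_1 + \lambda_2\psi_2) = \frac{1}{i\hbar}\theta(\lambda_1\psi_1 + \lambda_2\psi_2) \qquad (3.57)$$

for all constants $\lambda_1, \lambda_2$. Therefore,

$$\frac{1}{i\hbar}\theta(\lambda_1\psi_1 + \lambda_2\psi_2) = \frac{\partial}{\partial t}(\lambda_1\psi_1 + \lambda_2\psi_2) = \lambda_1\frac{\partial\psi_1}{\partial t} + \lambda_2\frac{\partial\psi_2}{\partial t} = \frac{1}{i\hbar}\lambda_1\theta\psi_1 + \frac{1}{i\hbar}\lambda_2\theta\psi_2$$

or

$$\theta(\lambda_1\psi_1 + \lambda_2\psi_2) = \lambda_1\theta\psi_1 + \lambda_2\theta\psi_2 \qquad (3.58)$$

which says that $\theta = \theta(\vec{x}, \vec{p}, t)$ must be linear.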

Assumption (c) - $\psi(\vec{x}, t)$ must be normalizable for all t in order to be a
physical wave function (Postulate (1a) from earlier). The simplest way to
guarantee this is to require

$$\int d^3x\, \psi^*(\vec{x}, t)\psi(\vec{x}, t) = \int d^3x\, \psi^*(\vec{x}, t_0)\psi(\vec{x}, t_0) \qquad (3.59)$$

where the initial wave function $\psi(\vec{x}, t_0)$ is assumed to be normalizable.
This should be regarded as a convenient choice for preserving normalization.
We note that this property fails if we were considering relativistic
quantum mechanics!

Notation: We have

$$\langle \psi_1(t) \,|\, \psi_2(t) \rangle = \int d^3x\, \psi_1^*(\vec{x}, t)\psi_2(\vec{x}, t) \qquad (3.60)$$

where the variable $\vec{x}$ has been omitted on the left hand side because it is
integrated over so that the resulting inner product depends only on t.

Therefore, assumption (c) can be written

$$\langle \psi(t) \,|\, \psi(t) \rangle = \langle \psi(t_0) \,|\, \psi(t_0) \rangle \qquad (3.61)$$

or

$$\frac{d}{dt}\langle \psi(t) \,|\, \psi(t) \rangle = 0 \qquad (3.62)$$

which gives

$$0 = \frac{d}{dt}\int d^3x\, \psi^*(\vec{x}, t)\psi(\vec{x}, t) = \int d^3x \left[ \frac{\partial\psi^*(\vec{x}, t)}{\partial t}\psi(\vec{x}, t) + \psi^*(\vec{x}, t)\frac{\partial\psi(\vec{x}, t)}{\partial t} \right]$$
$$= \Big\langle \frac{\partial\psi(t)}{\partial t} \,\Big|\, \psi(t) \Big\rangle + \Big\langle \psi(t) \,\Big|\, \frac{\partial\psi(t)}{\partial t} \Big\rangle = -\frac{1}{i\hbar}\langle \theta\psi(t) \,|\, \psi(t) \rangle + \frac{1}{i\hbar}\langle \psi(t) \,|\, \theta\psi(t) \rangle \qquad (3.63)$$

or

$$\langle \theta\psi(t) \,|\, \psi(t) \rangle = \langle \psi(t) \,|\, \theta\psi(t) \rangle \quad \text{for arbitrary } \psi(\vec{x}, t) \qquad (3.64)$$

Note that $\psi(\vec{x}, t)$ can be practically any square-integrable function because
$\psi(\vec{x}, t_0)$ may be chosen arbitrarily.

This result says that $\theta = \theta(\vec{x}, \vec{p}, t)$ must be hermitian.

Assumption (d) - This is the important assumption. We must construct
a quantum theory which, for macroscopic phenomena, reduces to the equations
of classical physics (Newton's laws of motion). This constraint on
the quantum theory is called the correspondence principle, and we will impose
it on our theory by assuming the following, which is experimentally
verified for microscopic and macroscopic phenomena:

Even though the results of individual measurements do
not obey the equations of classical physics in detail,
the average values of measurements obey the classical
equations of motion.


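Basic Assumption (will determine $\theta = \theta(\vec{x}, \vec{p}, t)$):

$$\frac{d}{dt}\langle p_i \rangle = \langle F_i \rangle = \Big\langle -\frac{\partial V(\vec{x})}{\partial x_i} \Big\rangle \quad \text{and} \quad \frac{d}{dt}\langle x_i \rangle = \Big\langle \frac{p_i}{m} \Big\rangle \qquad (3.65)$$

These equations correspond (are the quantum analogues) to the classical
equations of motion:

$$\frac{dp_i}{dt} = F_i = -\frac{\partial V(\vec{x})}{\partial x_i} \quad \text{and} \quad \frac{dx_i}{dt} = \frac{p_i}{m} \qquad (3.66)$$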


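The average values

$$\langle p_i \rangle = \frac{\langle \psi(t) \,|\, p_i\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} \quad \text{and} \quad \langle x_i \rangle = \frac{\langle \psi(t) \,|\, x_i\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.67)$$

depend on time because $\psi(\vec{x}, t)$ depends on time. The operators $x_i$ and
$p_i = -i\hbar\,\partial/\partial x_i$ do not change with time. Now we can write

$$\frac{d}{dt}\,\frac{\langle \psi(t) \,|\, p_i\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} = \frac{\big\langle \psi(t) \,\big|\, -\frac{\partial V(\vec{x})}{\partial x_i}\psi(t) \big\rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.68)$$

$$\frac{d}{dt}\,\frac{\langle \psi(t) \,|\, x_i\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} = \frac{1}{m}\,\frac{\langle \psi(t) \,|\, p_i\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.69)$$

But

$$\frac{d}{dt}\langle \psi(t) \,|\, \psi(t) \rangle = 0 \quad \text{since } \langle \psi(t) \,|\, \psi(t) \rangle \text{ is constant} \qquad (3.70)$$

implies that we can cancel the factor

$$\frac{1}{\langle \psi(t) \,|\, \psi(t) \rangle}$$

in these equations. Therefore

$$\frac{d}{dt}\langle \psi(t) \,|\, p_i\psi(t) \rangle = \Big\langle \psi(t) \,\Big|\, -\frac{\partial V(\vec{x})}{\partial x_i}\psi(t) \Big\rangle \qquad (3.71)$$

$$\frac{d}{dt}\langle \psi(t) \,|\, x_i\psi(t) \rangle = \frac{1}{m}\langle \psi(t) \,|\, p_i\psi(t) \rangle \qquad (3.72)$$

so that

$$\Big\langle \psi \,\Big|\, -\frac{\partial V(\vec{x})}{\partial x_i}\psi \Big\rangle = \frac{d}{dt}\langle \psi(t) \,|\, p_i\psi(t) \rangle = \Big\langle \frac{\partial\psi}{\partial t} \,\Big|\, p_i\psi \Big\rangle + \Big\langle \psi \,\Big|\, p_i\frac{\partial\psi}{\partial t} \Big\rangle \qquad (3.73)$$

$$= -\frac{1}{i\hbar}\langle \theta\psi \,|\, p_i\psi \rangle + \frac{1}{i\hbar}\langle \psi \,|\, p_i\theta\psi \rangle = -\frac{1}{i\hbar}\langle \psi \,|\, \theta p_i\psi \rangle + \frac{1}{i\hbar}\langle \psi \,|\, p_i\theta\psi \rangle$$
$$= -\frac{1}{i\hbar}\langle \psi \,|\, [\theta, p_i]\psi \rangle = -\frac{1}{i\hbar}\Big\langle \psi \,\Big|\, i\hbar\frac{\partial\theta}{\partial x_i}\psi \Big\rangle = \Big\langle \psi \,\Big|\, -\frac{\partial\theta}{\partial x_i}\psi \Big\rangle \qquad (3.74)$$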


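Since the above equation is true for any $\psi(\vec{x}, t)$ we have

$$\frac{\partial\theta(\vec{x}, \vec{p}, t)}{\partial x_i} = \frac{\partial V(\vec{x})}{\partial x_i} \qquad (3.75)$$

In a similar manner we can also show the following:

$$\frac{1}{m}\langle \psi(t) \,|\, p_i\psi(t) \rangle = \frac{d}{dt}\langle \psi(t) \,|\, x_i\psi(t) \rangle = \Big\langle \frac{\partial\psi}{\partial t} \,\Big|\, x_i\psi \Big\rangle + \Big\langle \psi \,\Big|\, x_i\frac{\partial\psi}{\partial t} \Big\rangle$$
$$= -\frac{1}{i\hbar}\langle \theta\psi \,|\, x_i\psi \rangle + \frac{1}{i\hbar}\langle \psi \,|\, x_i\theta\psi \rangle = -\frac{1}{i\hbar}\langle \psi \,|\, \theta x_i\psi \rangle + \frac{1}{i\hbar}\langle \psi \,|\, x_i\theta\psi \rangle$$
$$= -\frac{1}{i\hbar}\langle \psi \,|\, [\theta, x_i]\psi \rangle = \frac{1}{i\hbar}\Big\langle \psi \,\Big|\, i\hbar\frac{\partial\theta}{\partial p_i}\psi \Big\rangle = \Big\langle \psi \,\Big|\, \frac{\partial\theta}{\partial p_i}\psi \Big\rangle \qquad (3.76)$$

Since the above equation is true for any $\psi(\vec{x}, t)$ we have

$$\frac{\partial\theta(\vec{x}, \vec{p}, t)}{\partial p_i} = \frac{p_i}{m} \qquad (3.77)$$

We can now determine the operator $\theta(\vec{x}, \vec{p}, t)$: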


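$$\frac{\partial\theta(\vec{x}, \vec{p}, t)}{\partial p_x} = \frac{p_x}{m} \;\to\; \theta(\vec{x}, \vec{p}, t) = \frac{p_x^2}{2m} + \theta_1(p_y, p_z, x, y, z, t) \qquad (3.78)$$

$$\frac{\partial\theta(\vec{x}, \vec{p}, t)}{\partial p_y} = \frac{\partial\theta_1}{\partial p_y} = \frac{p_y}{m} \;\to\; \theta_1 = \frac{p_y^2}{2m} + \theta_2(p_z, x, y, z, t) \qquad (3.79)$$

$$\frac{\partial\theta(\vec{x}, \vec{p}, t)}{\partial p_z} = \frac{\partial\theta_2}{\partial p_z} = \frac{p_z}{m} \;\to\; \theta_2 = \frac{p_z^2}{2m} + \theta_3(x, y, z, t) \qquad (3.80)$$

$$\frac{\partial\theta(\vec{x}, \vec{p}, t)}{\partial x_i} = \frac{\partial\theta_3(x, y, z, t)}{\partial x_i} = \frac{\partial V(\vec{x})}{\partial x_i} \;\to\; \theta_3(x, y, z, t) = V(\vec{x}) + c(t) \qquad (3.81)$$

where $c(t)$ is an arbitrary function of t, i.e., it is independent of $\vec{x}$ and $\vec{p}$.
Therefore,

$$\theta(\vec{x}, \vec{p}, t) = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) + c(t) \qquad (3.82)$$

is the solution of the partial differential equations which $\theta(\vec{x}, \vec{p}, t)$ must
obey. Note that $\theta(\vec{x}, \vec{p}, t)$ being hermitian implies that $c(t)$ must be a real
function of t.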


Claim: $c(t)$ has no physical significance and can be chosen to be zero.
Proof: For $c(t)$ any arbitrary function, the wave function obeys

$$\frac{\partial\psi_c(\vec{x}, t)}{\partial t} = \frac{1}{i\hbar}\left[ \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) + c(t) \right]\psi_c(\vec{x}, t) \qquad (3.83)$$

subject to the initial condition $\psi_c(\vec{x}, t_0) = \psi(\vec{x}, t_0)$, which is a given
square-integrable function. Consider the function $N_c(t)\psi_c(\vec{x}, t)$ with

$$N_c(t) = e^{+\frac{i}{\hbar}\int_{t_0}^{t} c(t')\,dt'} \qquad (3.84)$$

where $c(t)$ real implies that $|N_c(t)| = 1$. In addition, $N_c(t = t_0) = e^0 = 1$. This
means that $N_c(t)\psi_c(\vec{x}, t)$ and $\psi_c(\vec{x}, t)$ obey the same initial conditions at
$t = t_0$.


Because $N_c(t)$ is just a multiplicative factor independent of $\vec{x}$ (it happens
to depend on time), $N_c(t)\psi_c(\vec{x}, t)$ and $\psi_c(\vec{x}, t)$ determine exactly the same
probability distribution and average values.

Thus, we can use $N_c(t)\psi_c(\vec{x}, t)$ as the wave function rather than $\psi_c(\vec{x}, t)$
and no physical values will be altered.


The differential equation obeyed by $N_c(t)\psi_c(\vec{x}, t)$ is

$$\frac{\partial(N_c\psi_c)}{\partial t} = \frac{\partial N_c}{\partial t}\psi_c + N_c\frac{\partial\psi_c}{\partial t} = \frac{i}{\hbar}c(t)N_c\psi_c + \frac{1}{i\hbar}\theta N_c\psi_c$$
$$= \frac{1}{i\hbar}\big(\theta - c(t)\big)N_c\psi_c = \frac{1}{i\hbar}\left[ \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \right] N_c\psi_c \qquad (3.85)$$

Letting $N_c(t)\psi_c(\vec{x}, t) = \psi(\vec{x}, t)$ we have

$$\frac{\partial\psi(\vec{x}, t)}{\partial t} = \frac{1}{i\hbar}\left[ \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \right]\psi(\vec{x}, t) \qquad (3.86)$$

which proves that no physical values will be altered if we use

$$\theta(\vec{x}, \vec{p}, t) = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \qquad (3.87)$$

with no $c(t)$ present. Thus,

$$\theta(\vec{x}, \vec{p}, t) = \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) = H(\vec{x}, \vec{p}) \qquad (3.88)$$

It is the Hamiltonian operator!

The partial differential equation just obtained determines the time-development
of the wave function between measurements.

Rather than postulate assumptions (a), (b), (c), (d), which went into the
above derivation, we will just postulate the result.

3.2.2. Postulate 5: Time Development of the Wave Function


For a particle in a conservative force field, the wave function obeys the
time-dependent Schrodinger equation:

$$i\hbar\frac{\partial\psi(\vec{x}, t)}{\partial t} = H\psi(\vec{x}, t) = \left[ \frac{\vec{p} \cdot \vec{p}}{2m} + V(\vec{x}) \right]\psi(\vec{x}, t) = -\frac{\hbar^2}{2m}\nabla^2\psi(\vec{x}, t) + V(\vec{x})\psi(\vec{x}, t) \qquad (3.89)$$

From this equation, one can determine $\psi(\vec{x}, t)$ at any time t by knowing $\psi(\vec{x}, t_0)$,
the wave function at some initial time $t_0$.

We note the important fact that the wave function's time-development is
completely determined by the time-dependent Schrodinger equation during periods
of time when no measurements are made. While a measurement is being made,
the wave function does not obey the time-dependent Schrodinger equation -
instead, the wave function collapses to an eigenfunction of the observable
being measured. The eigenfunction to which the wave function collapses is,
in general, unpredictable - only the probabilities for specific results of a
measurement can be determined. Contrast this with the completely predictable
time-development during periods of time when no measurements are made - a
time-development determined by the time-dependent Schrodinger equation.


3.3. Structure of Quantum Theory 

3.3.1. Initial preparation of the wave function at t 0 

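Measuring a complete set of compatible observables at $t_0$ determines $\psi(\vec{x}, t_0)$
immediately after these measurements (up to a multiplicative constant).

Then time development is governed by

$$H\psi = i\hbar\frac{\partial\psi}{\partial t} \qquad (3.90)$$

which takes $\psi(\vec{x}, t_0)$ (given) into $\psi(\vec{x}, t)$ (determined).

Measurement of A at time t (A need not be related to the observables measured
at $t_0$) gives

$$\wp(a_n, t) = \frac{1}{\langle \psi(t) \,|\, \psi(t) \rangle}\sum_\alpha |c_n^{(\alpha)}(t)|^2 = \sum_\alpha \frac{|\langle u_n^{(\alpha)} \,|\, \psi(t) \rangle|^2}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.91)$$

$$\wp(a_\nu, t) = \frac{1}{\langle \psi(t) \,|\, \psi(t) \rangle}\sum_\beta |c_\nu^{(\beta)}(t)|^2 = \sum_\beta \frac{|\langle u_\nu^{(\beta)} \,|\, \psi(t) \rangle|^2}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.92)$$

$$\langle A \rangle = \frac{\langle \psi(t) \,|\, A\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.93)$$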

Immediately after the measurement, the wave function collapses to a function 
which depends on the result of the measurement. 

Notice the conceptual difference between the time-dependent Schrodinger
equation

$$H\psi = i\hbar\frac{\partial\psi}{\partial t} \qquad (3.94)$$

and the time-independent Schrodinger equation $H\psi = E\psi$. The time-dependent
Schrodinger equation determines the time-development of any wave function 
between measurements. The time-independent Schrodinger equation is just one 
of many eigenvalue equations - it is the eigenvalue equation for energy and 
determines the possible results of an energy measurement. 

The time-dependent Schrodinger equation is motivated by the correspondence 
principle - average values obey the classical equations of motion. 

One can reverse our previous arguments and show that the time-dependent 
Schrodinger equation implies that average values obey the classical equations of 
motion. 
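
The reverse argument can be sketched in a few lines. What follows is a sketch under stated assumptions rather than the author's own derivation: $\psi$ is taken to be normalized, $H = \vec{p} \cdot \vec{p}/2m + V(\vec{x})$ is hermitian, the operator A does not depend explicitly on time, and the commutator rules (3.50) and (3.51) are used with $f = H$.

```latex
\begin{align*}
\frac{d}{dt}\langle\psi\,|\,A\psi\rangle
 &= \Big\langle\frac{\partial\psi}{\partial t}\,\Big|\,A\psi\Big\rangle
  + \Big\langle\psi\,\Big|\,A\frac{\partial\psi}{\partial t}\Big\rangle
  = -\frac{1}{i\hbar}\langle H\psi\,|\,A\psi\rangle
  + \frac{1}{i\hbar}\langle\psi\,|\,AH\psi\rangle
  = \frac{1}{i\hbar}\langle\psi\,|\,[A,H]\psi\rangle \\[4pt]
A = x_i:\quad
 & [x_i,H] = i\hbar\frac{\partial H}{\partial p_i} = i\hbar\frac{p_i}{m}
  \;\Rightarrow\;
  \frac{d\langle x_i\rangle}{dt} = \Big\langle\frac{p_i}{m}\Big\rangle \\[4pt]
A = p_i:\quad
 & [p_i,H] = -[H,p_i] = -i\hbar\frac{\partial H}{\partial x_i}
  = -i\hbar\frac{\partial V}{\partial x_i}
  \;\Rightarrow\;
  \frac{d\langle p_i\rangle}{dt} = \Big\langle-\frac{\partial V}{\partial x_i}\Big\rangle
\end{align*}
```

These are the relations of the Basic Assumption (3.65), now obtained as consequences of the time-dependent Schrodinger equation.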


3.3.2. Basic Problem of Quantum Mechanics 

Given $\psi(\vec{x}, 0)$, find $\psi(\vec{x}, t)$.

Let $\{u_n^{(\alpha)}(\vec{x})\}$ be a CON set of eigenfunctions of H ($Hu_n^{(\alpha)} = E_n u_n^{(\alpha)}$). For
simplicity of notation, we will assume that the spectrum of H is entirely discrete.
Our results generalize in an obvious manner when a continuum is also present.

Now we can expand $\psi(\vec{x}, t)$ at fixed time t in terms of a CON set as

$$\psi(\vec{x}, t) = \sum_{n,\alpha} c_n^{(\alpha)}(t)\,u_n^{(\alpha)}(\vec{x}) \qquad (3.95)$$

Therefore, we have

$$i\hbar\frac{\partial\psi}{\partial t} = \sum_{n,\alpha} i\hbar\frac{dc_n^{(\alpha)}(t)}{dt}\,u_n^{(\alpha)}(\vec{x}) = H\psi = \sum_{n,\alpha} c_n^{(\alpha)}(t)\,Hu_n^{(\alpha)}(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)}(t)\,E_n u_n^{(\alpha)}(\vec{x}) \qquad (3.96)$$

The linear independence of the $\{u_n^{(\alpha)}(\vec{x})\}$ implies that we can equate coefficients
in the expansions:

$$i\hbar\frac{dc_n^{(\alpha)}(t)}{dt} = E_n c_n^{(\alpha)}(t) \quad \text{for all } \alpha, n \qquad (3.97)$$

This equation is easily solved:

$$c_n^{(\alpha)}(t) = c_n^{(\alpha)}(0)\,e^{-\frac{i}{\hbar}E_n t} \qquad (3.98)$$

so that

$$\psi(\vec{x}, t) = \sum_{n,\alpha} c_n^{(\alpha)}(0)\,e^{-\frac{i}{\hbar}E_n t}\,u_n^{(\alpha)}(\vec{x}) \qquad (3.99)$$

This is a very important result. It represents the general result for the
expansion of an arbitrary wave function in terms of energy eigenfunctions.

The coefficients $c_n^{(\alpha)}(0)$ can be found from the given $\psi(\vec{x}, 0)$ since

$$\psi(\vec{x}, 0) = \sum_{n,\alpha} c_n^{(\alpha)}(0)\,u_n^{(\alpha)}(\vec{x}) \qquad (3.100)$$

so that

$$c_n^{(\alpha)}(0) = \langle u_n^{(\alpha)} \,|\, \psi(t = 0) \rangle \qquad (3.101)$$

Therefore, we have determined $\psi(\vec{x}, t)$ in terms of $\psi(\vec{x}, 0)$.
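
The time evolution of the expansion coefficients is simple enough to sketch in code. The following is a minimal illustration, assuming units with hbar = 1, non-degenerate levels, and made-up values for the initial coefficients and energies; the function names `evolve` and `energy_probabilities` are hypothetical and not from the text.

```python
import cmath

HBAR = 1.0  # assumption: units with hbar = 1

def evolve(coeffs0, energies, t):
    """c_n(t) = c_n(0) * exp(-i E_n t / hbar): each coefficient only
    acquires a phase determined by its own energy eigenvalue."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(coeffs0, energies)]

def energy_probabilities(coeffs):
    """|c_n|^2 / <psi|psi> for each level (non-degenerate case)."""
    norm = sum(abs(c) ** 2 for c in coeffs)
    return [abs(c) ** 2 / norm for c in coeffs]

coeffs0 = [0.6, 0.8j, 0.5 - 0.5j]  # hypothetical c_n(0), deliberately not normalized
energies = [1.0, 2.5, 4.0]         # hypothetical energy eigenvalues E_n

for t in (0.0, 0.7, 3.0):
    probs = energy_probabilities(evolve(coeffs0, energies, t))
    print(t, [round(p, 6) for p in probs])
```

Because each coefficient changes only by a phase, the printed probabilities are the same at every time, while the relative phases between coefficients (which control interference between terms of the superposition) do change.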


We note that the solution just obtained for $\psi(\vec{x}, t)$ can also be obtained by
solving the time-dependent Schrodinger equation by the method of separation
of variables. We have

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V(\vec{x})\psi \qquad (3.102)$$

Trying a solution of the form $\psi(\vec{x}, t) = U(\vec{x})T(t)$ we get

$$i\hbar\frac{dT}{dt}\,U = \left[ -\frac{\hbar^2}{2m}\nabla^2 U + V(\vec{x})U \right] T$$

Dividing by UT, we thus have function of t = function of $\vec{x}$. Because $\vec{x}$ and t are
independent variables, each side of this equation must equal a constant $\lambda$ (the
separation constant). We then have

$$i\hbar\frac{1}{T}\frac{dT}{dt} = \lambda = \frac{1}{U}\left( -\frac{\hbar^2}{2m}\nabla^2 U + V(\vec{x})U \right) = \frac{1}{U}HU \qquad (3.103)$$

or $HU = \lambda U$ so that U must be an energy eigenfunction with energy eigenvalue
$\lambda = E$ and

$$i\hbar\frac{dT}{dt} = \lambda T \;\to\; T(t) = T(0)\,e^{-\frac{i}{\hbar}\lambda t} \qquad (3.104)$$

Therefore, for a particular value of E (one of the eigenvalues) we have

$$\psi(\vec{x}, t) = T_E(0)\,e^{-\frac{i}{\hbar}Et}\,U_E(\vec{x}) \quad \left( = c_n^{(\alpha)}(0)\,e^{-\frac{i}{\hbar}E_n t}\,u_n^{(\alpha)}(\vec{x}) \right) \qquad (3.105)$$

which represents a complete set when we use all linearly independent solutions
for all possible E values. The general solution to the time-dependent
Schrodinger equation is obtained by summing over all such particular solutions:

$$\psi(\vec{x}, t) = \sum_E T_E(0)\,e^{-\frac{i}{\hbar}Et}\,U_E(\vec{x}) = \sum_{n,\alpha} c_n^{(\alpha)}(0)\,e^{-\frac{i}{\hbar}E_n t}\,u_n^{(\alpha)}(\vec{x}) \qquad (3.106)$$

as obtained earlier.


Some Notes: 


(a) Let $\psi(\vec{x}, t)$ be an arbitrary wave function. Its expansion in terms of energy
eigenfunctions is:

$$\psi(\vec{x}, t) = \sum_{n,\alpha} c_n^{(\alpha)}(t)\,u_n^{(\alpha)}(\vec{x}) \qquad (3.107)$$

where

$$c_n^{(\alpha)}(t) = c_n^{(\alpha)}(0)\,e^{-\frac{i}{\hbar}E_n t} \qquad (3.108)$$

The probability of measuring $E_n$ at time t is:

$$\wp(E_n, t) = \frac{\sum_\alpha |c_n^{(\alpha)}(t)|^2}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.109)$$

But $\langle \psi(t) \,|\, \psi(t) \rangle = \langle \psi(0) \,|\, \psi(0) \rangle$ and $|c_n^{(\alpha)}(t)| = |c_n^{(\alpha)}(0)|$. Therefore,

$$\wp(E_n, t) = \frac{\sum_\alpha |c_n^{(\alpha)}(0)|^2}{\langle \psi(0) \,|\, \psi(0) \rangle} = \wp(E_n, 0) \qquad (3.110)$$

Therefore, $\wp(E_n, t)$ is independent of time for any wave function (corresponds
to energy conservation).

(b) Let $\psi(\vec{x}, 0) = u_n^{(\alpha)}(\vec{x})$, that is, the particle is in an eigenfunction of energy
at t = 0. Therefore,

$$\psi(\vec{x}, t) = u_n^{(\alpha)}(\vec{x})\,e^{-\frac{i}{\hbar}E_n t} \qquad (3.111)$$

for any time t, that is, we just get multiplication by a phase factor! The particle
therefore remains in the same eigenfunction of energy for all t if it is initially
in an eigenfunction of energy. Eigenfunctions of energy are therefore called
stationary states. In general, if $\psi(\vec{x}, 0)$ is an eigenfunction of some operator
other than energy, then $\psi(\vec{x}, t)$ will not remain an eigenfunction of this operator
for $t \neq 0$.


Properties of Stationary States 

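In a stationary state we have

$$\psi(\vec{x}, t) = u_n^{(\alpha)}(\vec{x})\,e^{-\frac{i}{\hbar}E_n t} \;, \quad Hu_n^{(\alpha)} = E_n u_n^{(\alpha)} \qquad (3.112)$$

1. The probability density (probability per unit volume) for position
measurements is given by

$$\wp(\vec{x}, t) = \frac{|\psi(\vec{x}, t)|^2}{\langle \psi(t) \,|\, \psi(t) \rangle} \qquad (3.113)$$

and

$$\wp(\vec{x}, t) = \frac{\big| u_n^{(\alpha)}(\vec{x})\,e^{-\frac{i}{\hbar}E_n t} \big|^2}{\langle \psi(t) \,|\, \psi(t) \rangle} = \frac{|u_n^{(\alpha)}(\vec{x})|^2}{\langle \psi(0) \,|\, \psi(0) \rangle} = \frac{|\psi(\vec{x}, 0)|^2}{\langle \psi(0) \,|\, \psi(0) \rangle} = \wp(\vec{x}, 0)$$

Therefore, $\wp(\vec{x}, t)$ is independent of time for a stationary state.

2. Let A be any physical observable. For simplicity of notation, we will
assume that the spectrum of A is entirely discrete. Let $\{v_m^{(\beta)}(\vec{x})\}$ be a
CON set of eigenfunctions of A:

$$Av_m^{(\beta)}(\vec{x}) = a_m v_m^{(\beta)}(\vec{x}) \qquad (3.114)$$

Therefore,

$$\wp(a_m, t) = \sum_\beta \frac{|\langle v_m^{(\beta)} \,|\, \psi(t) \rangle|^2}{\langle \psi(t) \,|\, \psi(t) \rangle} = \sum_\beta \frac{\langle \psi(t) \,|\, v_m^{(\beta)} \rangle \langle v_m^{(\beta)} \,|\, \psi(t) \rangle}{\langle \psi(0) \,|\, \psi(0) \rangle}$$
$$= \sum_\beta \frac{\langle u_n^{(\alpha)} e^{-\frac{i}{\hbar}E_n t} \,|\, v_m^{(\beta)} \rangle \langle v_m^{(\beta)} \,|\, u_n^{(\alpha)} e^{-\frac{i}{\hbar}E_n t} \rangle}{\langle \psi(0) \,|\, \psi(0) \rangle} = \sum_\beta \frac{\langle u_n^{(\alpha)} \,|\, v_m^{(\beta)} \rangle \langle v_m^{(\beta)} \,|\, u_n^{(\alpha)} \rangle}{\langle \psi(0) \,|\, \psi(0) \rangle} = \wp(a_m, 0)$$

Thus, $\wp(a_m, t)$ is independent of time for a stationary state.

3. Let A be any physical observable. Then

$$\langle A \rangle_t = \frac{\langle \psi(t) \,|\, A\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} = \frac{\langle u_n^{(\alpha)} e^{-\frac{i}{\hbar}E_n t} \,|\, A u_n^{(\alpha)} e^{-\frac{i}{\hbar}E_n t} \rangle}{\langle \psi(0) \,|\, \psi(0) \rangle} = \frac{\langle u_n^{(\alpha)} \,|\, A u_n^{(\alpha)} \rangle}{\langle \psi(0) \,|\, \psi(0) \rangle} = \frac{\langle \psi(0) \,|\, A\psi(0) \rangle}{\langle \psi(0) \,|\, \psi(0) \rangle} = \langle A \rangle_{t=0}$$

Therefore, $\langle A \rangle$ is independent of time for a stationary state.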


Example of a Non-Stationary State 

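Let $\psi(\vec{x}, 0)$ be a linear combination of two non-degenerate energy eigenfunctions

$$\psi(\vec{x}, 0) = c_1 u_1(\vec{x}) + c_2 u_2(\vec{x}) \qquad (3.115)$$

where

$$Hu_i = E_i u_i \text{ for } i = 1, 2 \;, \quad E_2 > E_1 \;, \quad \langle u_i \,|\, u_j \rangle = \delta_{ij} \text{ for } i, j = 1, 2 \text{ (ON)} \qquad (3.116)$$

We have the energy level diagram as shown in Figure 3.1 below.

[Figure 3.1: Energy Levels. Two horizontal levels, $E_2$ above $E_1$.]

Therefore,

$$\psi(\vec{x}, t) = c_1 u_1(\vec{x})\,e^{-\frac{iE_1 t}{\hbar}} + c_2 u_2(\vec{x})\,e^{-\frac{iE_2 t}{\hbar}} \qquad (3.117)$$

with

$$\langle \psi(t) \,|\, \psi(t) \rangle = \langle \psi(0) \,|\, \psi(0) \rangle = |c_1|^2 + |c_2|^2 \qquad (3.118)$$

and

$$\wp(E_1, t) = \frac{|c_1|^2}{|c_1|^2 + |c_2|^2} \;, \quad \wp(E_2, t) = \frac{|c_2|^2}{|c_1|^2 + |c_2|^2} \qquad (3.119)$$

independent of t.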

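Let A be any physical observable, and let us calculate the time dependence of
$\langle A \rangle$ for the above state.

$$\langle A \rangle = \frac{\langle \psi(t) \,|\, A\psi(t) \rangle}{\langle \psi(t) \,|\, \psi(t) \rangle} = \frac{\big\langle c_1 u_1 e^{-\frac{iE_1 t}{\hbar}} + c_2 u_2 e^{-\frac{iE_2 t}{\hbar}} \,\big|\, A\big( c_1 u_1 e^{-\frac{iE_1 t}{\hbar}} + c_2 u_2 e^{-\frac{iE_2 t}{\hbar}} \big) \big\rangle}{|c_1|^2 + |c_2|^2}$$

$$= \frac{1}{|c_1|^2 + |c_2|^2}\Big[ |c_1|^2 \langle u_1 \,|\, Au_1 \rangle + |c_2|^2 \langle u_2 \,|\, Au_2 \rangle + c_2^* c_1\, e^{\frac{i(E_2 - E_1)t}{\hbar}} \langle u_2 \,|\, Au_1 \rangle + c_1^* c_2\, e^{-\frac{i(E_2 - E_1)t}{\hbar}} \langle u_1 \,|\, Au_2 \rangle \Big]$$

$$= \frac{1}{|c_1|^2 + |c_2|^2}\Big[ |c_1|^2 \langle u_1 \,|\, Au_1 \rangle + |c_2|^2 \langle u_2 \,|\, Au_2 \rangle + c_2^* c_1\, e^{\frac{i(E_2 - E_1)t}{\hbar}} \langle u_2 \,|\, Au_1 \rangle + c_1^* c_2\, e^{-\frac{i(E_2 - E_1)t}{\hbar}} \langle u_2 \,|\, Au_1 \rangle^* \Big]$$

where the last step uses $\langle u_1 \,|\, Au_2 \rangle = \langle Au_2 \,|\, u_1 \rangle^* = \langle u_2 \,|\, Au_1 \rangle^*$ (A is hermitian).

Now let

$$R = \frac{|c_1|^2 \langle u_1 \,|\, Au_1 \rangle + |c_2|^2 \langle u_2 \,|\, Au_2 \rangle}{|c_1|^2 + |c_2|^2} \qquad (3.120)$$

(this is a real number (because A is hermitian)) and

$$Z = \frac{c_2^* c_1 \langle u_2 \,|\, Au_1 \rangle}{|c_1|^2 + |c_2|^2} = |Z|\,e^{i\xi} \qquad (3.121)$$

which is a complex number. Therefore,

$$\langle A \rangle = R + Z e^{\frac{i(E_2 - E_1)t}{\hbar}} + Z^* e^{-\frac{i(E_2 - E_1)t}{\hbar}} = R + |Z|\left[ e^{i\left[\frac{(E_2 - E_1)t}{\hbar} + \xi\right]} + e^{-i\left[\frac{(E_2 - E_1)t}{\hbar} + \xi\right]} \right]$$
$$= R + 2|Z|\cos\left[ \frac{(E_2 - E_1)t}{\hbar} + \xi \right] \qquad (3.122)$$

so that $\langle A \rangle$ oscillates in time with angular frequency $\omega = (E_2 - E_1)/\hbar$ when $|Z| \neq 0$,
which requires $\langle u_2 \,|\, Au_1 \rangle \neq 0$ (and neither $c_1$ nor $c_2$ zero).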
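
The closed form above can be compared against a direct evaluation of the expectation value. The sketch below is an illustration with made-up numbers, assuming units with hbar = 1; the function names and the sample matrix `a_matrix`, whose entries stand for the matrix elements of A between the two energy eigenfunctions, are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def expectation_direct(c1, c2, e1, e2, a, t):
    """<A>(t) evaluated directly from the two-term superposition.
    a[i][j] holds the matrix element of A between u_(i+1) and u_(j+1);
    the matrix must be hermitian."""
    amp = [c1 * cmath.exp(-1j * e1 * t / HBAR),
           c2 * cmath.exp(-1j * e2 * t / HBAR)]
    num = sum(amp[i].conjugate() * a[i][j] * amp[j]
              for i in range(2) for j in range(2))
    return (num / (abs(c1) ** 2 + abs(c2) ** 2)).real

def expectation_closed_form(c1, c2, e1, e2, a, t):
    """R + 2|Z| cos((E2 - E1) t / hbar + phase of Z)."""
    norm = abs(c1) ** 2 + abs(c2) ** 2
    r = (abs(c1) ** 2 * a[0][0] + abs(c2) ** 2 * a[1][1]).real / norm
    z = c2.conjugate() * c1 * a[1][0] / norm
    return r + 2.0 * abs(z) * math.cos((e2 - e1) * t / HBAR + cmath.phase(z))

# Hypothetical sample data: coefficients, energies, hermitian 2 x 2 matrix.
c1, c2 = 1.0 + 0.0j, 0.5 + 0.5j
e1, e2 = 1.0, 3.0
a_matrix = [[0.2, 1.0 - 0.5j],
            [1.0 + 0.5j, -0.4]]

for t in (0.0, 0.4, 1.3, 2.9):
    print(t,
          round(expectation_direct(c1, c2, e1, e2, a_matrix, t), 10),
          round(expectation_closed_form(c1, c2, e1, e2, a_matrix, t), 10))
```

The two columns should agree at every time. Setting the off-diagonal entries of `a_matrix` to zero removes the oscillating term, which is the numerical counterpart of the condition on the matrix element of A between the two states.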


If A = x (position operator), then $\langle x \rangle$ is harmonic in time with angular frequency
$\omega = (E_2 - E_1)/\hbar$. If the particle is charged (for example, an electron), then its average
position will oscillate harmonically in time and the particle will therefore radiate
at angular frequency $\omega = (E_2 - E_1)/\hbar$ when $c_2^* c_1 \langle u_2 \,|\, xu_1 \rangle \neq 0$. Suppose an
electron in an atom makes a transition from energy level $E_2$ to energy level
$E_1$. We will show later in this chapter that the wave function for such an
electron is a linear combination of $u_2$ and $u_1$ with neither $c_2$ nor $c_1$ zero. Thus,
the electron will radiate at angular frequency $\omega$ such that $\hbar\omega = (E_2 - E_1)$ when
$\langle u_2 \,|\, xu_1 \rangle \neq 0$. This requirement for the radiation to occur is known as a selection
rule - it is a condition on the states $u_2$ and $u_1$ for a radiative transition to be
possible.


Of course, the electron does not radiate forever as it goes from level $E_2$ to
level $E_1$. We will show later in this chapter that $c_2$ eventually becomes zero
so that the electron is eventually in the lower energy state completely and no
radiation is emitted thereafter (the transition is then over). If we assume one
photon is emitted during the transition, energy conservation implies $E_{photon} =
E_2 - E_1 = \hbar\omega$ where $\omega$ = angular frequency of the radiation. We therefore obtain
the Einstein relation $E_{photon} = \hbar\omega$!


3.4. Free Particle in One Dimension (motion along the x-axis)


We have

$$H = \frac{p_{xop}^2}{2m} \qquad (3.123)$$

We will use p to denote the eigenvalues of the operator $p_{xop}$.

Now, $[H, p_{xop}] = 0$ implies that there exists a CON set of simultaneous
eigenfunctions of H and $p_{xop}$. The eigenfunctions of $p_{xop}$ are (from earlier)

$$u_p(x) = \frac{1}{\sqrt{2\pi\hbar}}\,e^{\frac{ipx}{\hbar}} \;, \quad p \in (-\infty, +\infty) \qquad (3.124)$$

Thus, $\{u_p(x)\}$ is a CON set of simultaneous eigenfunctions of H and $p_{xop}$.


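Suppose we are given $\psi(x,0)$. The problem is then to find $\psi(x,t)$ for all $t$. We may expand $\psi(x,t)$ in terms of $\{u_p(x)\}$:

$$ \psi(x,t) = \int_{-\infty}^{\infty} dp\, \tilde{\psi}(p,t)\, u_p(x) \tag{3.125} $$

where the $\tilde{\psi}(p,t)$ are expansion coefficients which depend on the time at which the expansion is made.

Because $u_p(x)$ is an energy eigenfunction, the time dependence of $\tilde{\psi}(p,t)$ is given by

$$ \tilde{\psi}(p,t) = \tilde{\psi}(p,0)\, e^{-iEt/\hbar} \ , \quad E = \frac{p^2}{2m} \tag{3.126} $$

Therefore,

$$ \psi(x,t) = \int_{-\infty}^{\infty} dp\, \tilde{\psi}(p,0)\, e^{-ip^2 t/2m\hbar}\, u_p(x) \tag{3.127} $$

$\tilde{\psi}(p,0)$ may be expressed in terms of the given $\psi(x,0)$ as follows:

$$ \psi(x,0) = \int_{-\infty}^{\infty} dp\, \tilde{\psi}(p,0)\, u_p(x) \tag{3.128} $$

so that

$$ \langle u_{p'} \,|\, \psi(t=0)\rangle = \int_{-\infty}^{\infty} dp\, \tilde{\psi}(p,0)\, \langle u_{p'} \,|\, u_p\rangle = \int_{-\infty}^{\infty} dp\, \tilde{\psi}(p,0)\, \delta(p'-p) = \tilde{\psi}(p',0) \tag{3.129} $$

Thus,

$$ \tilde{\psi}(p,0) = \langle u_p \,|\, \psi(t=0)\rangle = \int_{-\infty}^{\infty} dx\, u_p^{*}(x)\, \psi(x,0) \tag{3.130} $$

that is,

$$ \tilde{\psi}(p,0) = \int_{-\infty}^{\infty} dx\, \frac{e^{-ipx/\hbar}}{\sqrt{2\pi\hbar}}\, \psi(x,0) \tag{3.131} $$

Thus, given $\psi(x,0)$ we can calculate $\tilde{\psi}(p,0)$ and then $\psi(x,t)$.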


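The probability distribution of momentum measurements is given by:

$$ \wp(p,t) = \frac{\langle \psi(t) \,|\, u_p\rangle \langle u_p \,|\, \psi(t)\rangle}{\langle \psi(t) \,|\, \psi(t)\rangle} \tag{3.132} $$

where

$$ \langle u_p \,|\, \psi(t)\rangle = \tilde{\psi}(p,t) = \tilde{\psi}(p,0)\, e^{-iEt/\hbar} = \tilde{\psi}(p,0) \times (\text{a phase factor}) \tag{3.133} $$

Therefore, since the norm $\langle \psi(t) \,|\, \psi(t)\rangle$ is independent of time,

$$ \wp(p,t) = \frac{|\tilde{\psi}(p,t)|^2}{\langle \psi(t) \,|\, \psi(t)\rangle} = \frac{|\tilde{\psi}(p,0)|^2}{\langle \psi(0) \,|\, \psi(0)\rangle} = \wp(p,0) \tag{3.134} $$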


For a free particle, the momentum probability distribution does not change with 
time. This is momentum conservation for a particle with no forces acting on it. 
In general, the spatial probability distribution changes with time. 


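Now,

$$ \psi(x,t) = \int_{-\infty}^{\infty} \frac{dp}{\sqrt{2\pi\hbar}}\, \tilde{\psi}(p,0)\, e^{i(kx-\omega t)} \tag{3.135} $$

where

$$ k = \frac{p}{\hbar} \quad \text{and} \quad \omega = \frac{p^2}{2m\hbar} \tag{3.136} $$

$\psi(x,t)$ is a superposition of plane waves and is called a wave packet.

For each plane wave:

$$ e^{i\left(\frac{px}{\hbar}-\frac{p^2 t}{2m\hbar}\right)} = e^{i(kx-\omega t)} = \text{plane wave with} $$

$$ \frac{2\pi}{\lambda} = k = \frac{p}{\hbar} \ \Rightarrow\ \lambda = \frac{2\pi\hbar}{p} \tag{3.137} $$

$$ 2\pi\nu = \omega = \frac{p^2}{2m\hbar} \ \Rightarrow\ \nu = \frac{p^2}{4\pi m\hbar} \tag{3.138} $$

The planes of constant phase move at the phase velocity

$$ v_{phase} = \lambda\nu = \frac{p}{2m} \tag{3.139} $$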


The phase velocity has a different value for each $p$ in the superposition. The factor $1/2$ in the expression

$$ v_{phase} = \frac{p}{2m} \tag{3.140} $$

may seem surprising because, for classical particles,

$$ v_{classical} = \frac{p_{classical}}{m} \tag{3.141} $$

However, one must remember that the entire wave packet $\psi(x,t)$ describes the particle. The pertinent quantity is the so-called group velocity of the wave packet:

$$ v_{group} = \frac{d\langle x\rangle}{dt} = \frac{\langle p_x\rangle}{m} \tag{3.142} $$

where the averages are for the entire wave packet. The group velocity characterizes the entire wave function and is the quantity that corresponds to $v_{classical}$.
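The factor of two between the two velocities can be made concrete with the free-particle dispersion relation $\omega(k) = \hbar k^2/2m$. The sketch below assumes the standard wave-theory definitions $v_{phase} = \omega/k$ and $v_{group} = d\omega/dk$ (the latter is an assumption brought in for illustration; the text defines the group velocity through packet averages) and works in units with $\hbar = m = 1$. The function names are hypothetical.

```python
def omega(k, hbar=1.0, m=1.0):
    """Free-particle dispersion relation omega = hbar k^2 / 2m."""
    return hbar * k * k / (2.0 * m)

def phase_velocity(k, hbar=1.0, m=1.0):
    """Speed of the planes of constant phase, omega / k = p / 2m."""
    return omega(k, hbar, m) / k

def group_velocity(k, hbar=1.0, m=1.0, dk=1e-5):
    """Central-difference estimate of d(omega)/dk = p / m."""
    return (omega(k + dk, hbar, m) - omega(k - dk, hbar, m)) / (2.0 * dk)

k0 = 2.0  # illustrative wave number, so p0 = hbar * k0 = 2
print("phase velocity:", phase_velocity(k0))   # p0 / 2m
print("group velocity:", group_velocity(k0))   # p0 / m, the classical value
```

For a packet concentrated near $p_0 = \hbar k_0$ the second number reproduces $p_0/m$, the value a classical particle of momentum $p_0$ would have.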


Often, one would like to estimate the behavior of $\psi(x,t)$ without explicitly doing the integral over the plane waves. The following very general method of approximation is useful for such estimates.


3.4.1. The Method of Stationary Phase 

Let

$$ \psi(x,t) = \int_{-\infty}^{\infty} dp\, g(p,x,t) \tag{3.143} $$

where $g(p,x,t)$ is a complex function such that

$$ g(p,x,t) = G(p,x,t)\, e^{i\gamma(p,x,t)} \tag{3.144} $$

with

$0 \leq G(p,x,t)$ = real amplitude of $g(p,x,t)$
$\gamma(p,x,t)$ = phase angle of $g(p,x,t)$

Therefore,

$$ \psi(x,t) = \int_{-\infty}^{\infty} dp\, G(p,x,t)\, e^{i\gamma(p,x,t)} \tag{3.145} $$

Now we assume that $G(p,x,t)$ is sharply peaked at $p = p_0$ and is appreciable only in a small range $\Delta p$ about $p_0$ as shown in Figure 3.2 below.

We assume that $\Delta p$ and $p_0$ do not depend on $x$ and $t$. However, the value of $G$ at $p_0$ and the detailed behavior of $G$ depend on $x$ and $t$.


(a) Estimate of $(x,t)$ values for which $|\psi(x,t)|$ is a maximum

For given $x$ and $t$, $\psi(x,t) \approx 0$ if $\gamma(p,x,t)$ changes quite a bit as $p$ varies over the $\Delta p$ range (for such a case, $\cos\gamma(p,x,t)$ and $\sin\gamma(p,x,t)$ oscillate a great deal over $\Delta p$ and the integral $\approx 0$ because of cancellations; recall that $G(p,x,t) \geq 0$).

$|\psi(x,t)|$ will be a maximum if $\gamma(p,x,t)$ changes a negligible amount over the $\Delta p$ range (for such a case, the entire integrand does not change sign over $\Delta p$ and there are no cancellations). Therefore we want

$$ \left[\frac{\partial\gamma(p,x,t)}{\partial p}\right]_{p=p_0} = 0 \tag{3.146} $$

as the condition for $(x,t)$ values for which $|\psi(x,t)|$ is a maximum. This just says that the integrand's phase is stationary at $p_0$.


(b) Estimate of $(x,t)$ values for which $|\psi(x,t)|$ is appreciable

$|\psi(x,t)|$ will be appreciable, that is, non-negligible, for all $(x,t)$ values for which $e^{i\gamma(p,x,t)}$ does not vary over more than one cycle in the $\Delta p$ range. If $e^{i\gamma(p,x,t)}$ varied over much more than one cycle in the $\Delta p$ range, the integral would be $\approx 0$ because of cancellations. Therefore, for appreciable or non-negligible $|\psi(x,t)|$ we must have

$$ \left|\,[\text{change in } \gamma(p,x,t) \text{ over } \Delta p]\,\right| = \Delta p \left|\frac{\partial\gamma(p,x,t)}{\partial p}\right|_{p=p_0} \leq 2\pi \tag{3.147} $$

Comment: $|\psi(x,t)|$ is certainly non-negligible when $|\psi(x,t)|$ is a maximum.

This is consistent with the above conditions because $|\psi(x,t)|$ is a maximum when

$$ \left[\frac{\partial\gamma(p,x,t)}{\partial p}\right]_{p=p_0} = 0 \tag{3.148} $$

and this clearly satisfies the condition for $\psi(x,t)$ to be non-negligible.


3.4.2. Application to a Free-Particle Wave Packet 

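We have

$$ \psi(x,t) = \int_{-\infty}^{\infty} \frac{dp}{\sqrt{2\pi\hbar}}\, \tilde{\psi}(p,0)\, e^{i\left(\frac{px}{\hbar}-\frac{p^2 t}{2m\hbar}\right)} \tag{3.149} $$

Let $\tilde{\psi}(p,0) = |\tilde{\psi}(p,0)|\, e^{i\alpha(p)/\hbar}$ be non-negligible in a $\Delta p$ range about $p = p_0$. For convenience, the phase of $\tilde{\psi}(p,0)$ is written as $\alpha(p)/\hbar$. $\Delta p$ measures the spread in the momentum probability distribution. We have

$$ \psi(x,t) = \int_{-\infty}^{\infty} \frac{dp}{\sqrt{2\pi\hbar}}\, |\tilde{\psi}(p,0)|\, e^{i\left(\frac{px}{\hbar}-\frac{p^2 t}{2m\hbar}+\frac{\alpha(p)}{\hbar}\right)} \tag{3.150} $$

We have separated the integrand into an amplitude and a phase factor. The values of $(x,t)$ for which $|\psi(x,t)|$ is a maximum are given by the stationary phase condition

$$ \left[\frac{\partial}{\partial p}\left(\frac{px}{\hbar}-\frac{p^2 t}{2m\hbar}+\frac{\alpha(p)}{\hbar}\right)\right]_{p=p_0} = \frac{x}{\hbar}-\frac{p_0 t}{m\hbar}+\frac{1}{\hbar}\left[\frac{d\alpha(p)}{dp}\right]_{p=p_0} = 0 $$

Therefore, $\psi(x,t)$ is peaked at $x = x_{peak}(t)$ where

$$ x_{peak}(t) = \frac{p_0 t}{m} - \left[\frac{d\alpha(p)}{dp}\right]_{p=p_0} \tag{3.151} $$

that is, the peak of the wave packet moves at constant velocity

$$ v_{packet} = \frac{p_0}{m} \tag{3.152} $$

Notice that the phase angle of $\tilde{\psi}(p,0)$ determines the position of the peak at $t = 0$. $\psi(x,t)$ is appreciable at all $(x,t)$ satisfying

$$ \Delta p \left|\left[\frac{\partial}{\partial p}\left(\frac{px}{\hbar}-\frac{p^2 t}{2m\hbar}+\frac{\alpha(p)}{\hbar}\right)\right]_{p=p_0}\right| \leq 2\pi $$

$$ \Delta p \left|\frac{x}{\hbar}-\frac{p_0 t}{m\hbar}+\frac{1}{\hbar}\left[\frac{d\alpha(p)}{dp}\right]_{p=p_0}\right| \leq 2\pi $$

$$ \Delta p\, |x - x_{peak}(t)| \leq 2\pi\hbar $$

or we must have

$$ |x - x_{peak}(t)| \leq \frac{2\pi\hbar}{\Delta p} = \frac{h}{\Delta p} \tag{3.153} $$

for $\psi(x,t)$ appreciable as shown in Figure 3.3 below.

Figure 3.3: $\psi(x,t)$ appreciable

The maximum value of $|x - x_{peak}(t)|$ (with $x$ such that $\psi(x,t)$ is appreciable) is $\approx h/\Delta p$. But $\Delta x$, the spread of $\psi(x,t)$, is not less (see figure) than $\max|x - x_{peak}(t)|$ for $\psi(x,t)$ appreciable. Therefore, $\Delta x \gtrsim h/\Delta p$, which is consistent with the uncertainty principle.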
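The stationary-phase prediction for the peak can be compared with a direct numerical evaluation of the plane-wave integral. The sketch below is only an illustration under stated assumptions: a Gaussian $|\tilde{\psi}(p,0)|$ of width `sigma_p` centered at `p0`, a linear phase $\alpha(p) = -p\,x_0$ (so that the predicted peak is $x_0 + p_0 t/m$), units with $\hbar = m = 1$, and a plain Riemann sum for the integral. All names are hypothetical.

```python
import cmath
import math

HBAR = 1.0
MASS = 1.0

def momentum_amplitude(p, p0, sigma_p, x0=0.0):
    """Gaussian modulus with phase alpha(p)/hbar = -p*x0/hbar (assumed form)."""
    envelope = math.exp(-((p - p0) ** 2) / (4.0 * sigma_p ** 2))
    return envelope * cmath.exp(-1j * p * x0 / HBAR)

def psi(x, t, p0, sigma_p, x0=0.0, n_p=201, width=6.0):
    """Riemann-sum approximation to the wave packet integral (unnormalized)."""
    p_min = p0 - width * sigma_p
    dp = 2.0 * width * sigma_p / (n_p - 1)
    total = 0j
    for j in range(n_p):
        p = p_min + j * dp
        phase = (p * x - p * p * t / (2.0 * MASS)) / HBAR
        total += momentum_amplitude(p, p0, sigma_p, x0) * cmath.exp(1j * phase)
    return total * dp / math.sqrt(2.0 * math.pi * HBAR)

def peak_position(t, p0, sigma_p, x0=0.0, x_min=-10.0, x_max=30.0, n_x=401):
    """Grid point where |psi(x,t)| is largest."""
    best_x, best_val = x_min, -1.0
    for i in range(n_x):
        x = x_min + i * (x_max - x_min) / (n_x - 1)
        val = abs(psi(x, t, p0, sigma_p, x0))
        if val > best_val:
            best_x, best_val = x, val
    return best_x

for t in (0.0, 5.0):
    predicted = 1.0 + 2.0 * t / MASS   # x0 + p0 t / m from the stationary phase condition
    print(t, peak_position(t, p0=2.0, sigma_p=0.5, x0=1.0), predicted)
```

The numerically located peak agrees with $x_0 + p_0 t/m$ to within the grid spacing, so the packet as a whole moves at $p_0/m$ even though each plane wave in it has phase velocity $p/2m$.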


3.5. Constants of the Motion 

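The time-development of the wave function is determined by the time-dependent Schrodinger equation:

$$ i\hbar\frac{\partial\psi(x,t)}{\partial t} = H\psi(x,t) \tag{3.154} $$

Let $A = A(x,p)$ be the hermitian operator corresponding to some physical quantity. We will assume that this operator does not change with time ($A$ is independent of $t$).

The average value of $A$,

$$ \langle A\rangle = \frac{\langle \psi(t) \,|\, A\psi(t)\rangle}{\langle \psi(t) \,|\, \psi(t)\rangle} \tag{3.155} $$

depends on time through the time dependence of $\psi(x,t)$. Recall that $\langle \psi(t) \,|\, \psi(t)\rangle = \langle \psi(0) \,|\, \psi(0)\rangle$ is independent of time. Thus,

$$
\begin{aligned}
\frac{d}{dt}\langle A\rangle &= \frac{1}{\langle \psi(t) \,|\, \psi(t)\rangle}\frac{d}{dt}\langle \psi(t) \,|\, A\psi(t)\rangle \\
&= \frac{1}{\langle \psi(t) \,|\, \psi(t)\rangle}\left(\left\langle \frac{\partial\psi}{\partial t} \,\Big|\, A\psi\right\rangle + \left\langle \psi \,\Big|\, A\frac{\partial\psi}{\partial t}\right\rangle\right) \\
&= \frac{1}{\langle \psi(t) \,|\, \psi(t)\rangle}\left(\left\langle \frac{1}{i\hbar}H\psi \,\Big|\, A\psi\right\rangle + \left\langle \psi \,\Big|\, A\frac{1}{i\hbar}H\psi\right\rangle\right) \\
&= \frac{1}{\langle \psi(t) \,|\, \psi(t)\rangle}\frac{1}{i\hbar}\left(-\langle H\psi \,|\, A\psi\rangle + \langle \psi \,|\, AH\psi\rangle\right) \\
&= \frac{1}{\langle \psi(t) \,|\, \psi(t)\rangle}\frac{1}{i\hbar}\left(-\langle \psi \,|\, HA\psi\rangle + \langle \psi \,|\, AH\psi\rangle\right)
\end{aligned}
\tag{3.156}
$$

where the last step uses the hermiticity of $H$. Therefore,

$$ \frac{d}{dt}\langle A\rangle = \frac{1}{\langle \psi(t) \,|\, \psi(t)\rangle}\left\langle \psi(t) \,\Big|\, \frac{[A,H]}{i\hbar}\,\psi(t)\right\rangle = \frac{1}{i\hbar}\langle [A,H]\rangle \tag{3.157} $$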


Definition: The physical observable $A$ is conserved, that is, $A$ is a constant of the motion, if

$$ \frac{d}{dt}\langle A\rangle = 0 \ \text{ for any } \psi(x,t) \tag{3.158} $$

Recall that $d\langle B\rangle/dt = 0$ for any observable $B$ if the average value is taken in a stationary state. $B$ is conserved if $d\langle B\rangle/dt = 0$ for all possible non-stationary states as well.


Notes 

1. A is conserved if [A, H ] = 0, which follows immediately from the definition 
and the preceding results. 

2. A is conserved if A and H have a CON set of simultaneous eigenfunctions. 
This follows from (1) and the fact that commutativity is equivalent to 
compatibility. 

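3. If $A$ is conserved, then the probability distribution of measurements of $A$ ($\wp(a_n,t)$ and $\wp(a_{c\nu},t)$) is constant in time for any $\psi(x,t)$. Recall that the probability distribution of measurements for any observable is constant in time for a stationary state.

Proof: $A$ and $H$ possess a CON set of simultaneous eigenfunctions $\{u_{nm}^{(\alpha)}(x)\}$.

For simplicity of notation, we assume that the spectra are entirely discrete. Then

$$ A u_{nm}^{(\alpha)}(x) = a_n u_{nm}^{(\alpha)}(x) \ , \quad H u_{nm}^{(\alpha)}(x) = E_m u_{nm}^{(\alpha)}(x) \tag{3.159} $$

and the expansion coefficients $c_{nm}^{(\alpha)}(t) = \langle u_{nm}^{(\alpha)} \,|\, \psi(t)\rangle$ of $\psi(x,t)$ in this set change only by a phase, $c_{nm}^{(\alpha)}(t) = c_{nm}^{(\alpha)}(0)\, e^{-iE_m t/\hbar}$. Thus,

$$ \wp(a_n,t) = \frac{\sum_{m,\alpha}\left|c_{nm}^{(\alpha)}(t)\right|^2}{\langle \psi(t) \,|\, \psi(t)\rangle} = \frac{\sum_{m,\alpha}\left|c_{nm}^{(\alpha)}(0)\right|^2}{\langle \psi(0) \,|\, \psi(0)\rangle} = \wp(a_n,0) \tag{3.160} $$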


4. Let $A$ be conserved. If $\psi(x,0)$ is an eigenfunction of $A$ with eigenvalue $a_n$, then $\psi(x,t)$ remains an eigenfunction of $A$ with eigenvalue $a_n$ for all time. This follows immediately from note (3) and the fact that $\wp(a_n,t) = 1$ if $\psi(x,t)$ = an eigenfunction of $A$ with eigenvalue $a_n$.
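The notes above can be illustrated with the smallest possible example, a two-level system. The sketch below assumes $\hbar = 1$, a diagonal Hamiltonian with made-up energies, one observable `A` that commutes with `H` and one observable `B` that does not; the matrices and helper names are hypothetical and chosen only for illustration.

```python
import cmath

HBAR = 1.0

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(len(X))] for i in range(len(X))]

def evolve(c0, energies, t):
    """Each energy-eigenstate coefficient only picks up the phase exp(-i E t / hbar)."""
    return [c * cmath.exp(-1j * E * t / HBAR) for c, E in zip(c0, energies)]

def expectation(M, c):
    """<psi | M psi> / <psi | psi> for the coefficient vector c."""
    n = len(c)
    num = sum(c[i].conjugate() * M[i][j] * c[j] for i in range(n) for j in range(n))
    norm = sum(abs(ci) ** 2 for ci in c)
    return (num / norm).real

energies = [1.0, 2.5]                 # illustrative energy eigenvalues
H = [[1.0, 0.0], [0.0, 2.5]]
A = [[3.0, 0.0], [0.0, -1.0]]         # diagonal in the energy basis: [A, H] = 0
B = [[0.0, 1.0], [1.0, 0.0]]          # couples the two levels: [B, H] != 0
c0 = [0.6, 0.8]                       # a non-stationary initial state

print("[A,H] =", commutator(A, H))
print("[B,H] =", commutator(B, H))
for t in (0.0, 1.0, 2.0):
    c = evolve(c0, energies, t)
    print(t, expectation(A, c), expectation(B, c))
```

The average of `A` stays at the same value for every $t$, while the average of `B` oscillates at the frequency $(E_2 - E_1)/\hbar$, as expected for an observable that does not commute with $H$.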


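Example

Particle in a conservative force field

$$ H = \frac{\vec{p}\cdot\vec{p}}{2m} + V(\vec{x}) \tag{3.161} $$

$$ [H,H] = 0 \ \Rightarrow\ H \text{ conserved (energy conservation)} \tag{3.162} $$

Example

Free particle (no forces present)

$$ H = \frac{\vec{p}\cdot\vec{p}}{2m} \tag{3.163} $$

$$ [p_i,H] = 0 \ \Rightarrow\ p_i \text{ conserved (linear momentum conservation)} \tag{3.164} $$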

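Example

Particle in a central force field

$$ H = \frac{\vec{p}\cdot\vec{p}}{2m} + V(r) \quad \text{where } r = |\vec{x}| \tag{3.165} $$

Consider $L_z = xp_y - yp_x$. We have

$$
\begin{aligned}
[L_z,H] &= \left[xp_y - yp_x \,,\ \frac{p_x^2}{2m}+\frac{p_y^2}{2m}+\frac{p_z^2}{2m}+V(r)\right] \\
&= \left[xp_y \,,\ \frac{p_x^2}{2m}\right] + [xp_y, V(r)] - \left[yp_x \,,\ \frac{p_y^2}{2m}\right] - [yp_x, V(r)] + \text{other terms which are zero} \\
&= \frac{1}{2m}[x,p_x^2]\,p_y + x[p_y,V(r)] - \frac{1}{2m}[y,p_y^2]\,p_x - y[p_x,V(r)] \\
&= \frac{1}{2m}(2i\hbar p_x)p_y + x\left(-i\hbar\frac{\partial V}{\partial y}\right) - \frac{1}{2m}(2i\hbar p_y)p_x - y\left(-i\hbar\frac{\partial V}{\partial x}\right) \\
&= i\hbar\left(y\frac{\partial V}{\partial x} - x\frac{\partial V}{\partial y}\right) = i\hbar\left(y\frac{\partial r}{\partial x} - x\frac{\partial r}{\partial y}\right)\frac{dV}{dr} \\
&= i\hbar\left(y\frac{x}{r} - x\frac{y}{r}\right)\frac{dV}{dr} = 0
\end{aligned}
$$

Thus, $[L_z,H] = 0$. Because

$$ H = \frac{\vec{p}\cdot\vec{p}}{2m} + V(r) \tag{3.166} $$

is unchanged when $z$ and $x$ are interchanged or when $z$ and $y$ are interchanged, we conclude that $[L_x,H] = 0$ and $[L_y,H] = 0$. Therefore, $[\vec{L},H] = 0$ or $\vec{L}$ is conserved (angular momentum conservation for a central force).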

We now consider a particular (solvable) physical system in great detail.


3.6. Harmonic Oscillator in One Dimension 

We have

$$ -\frac{dV(x)}{dx} = F_x = -kx \ , \quad k > 0 \ , \quad V(x) = \frac{1}{2}kx^2 \tag{3.167} $$

Therefore,

$$ H = \frac{p_x^2}{2m} + \frac{1}{2}kx^2 \tag{3.168} $$

Classical

$$ \frac{d^2x}{dt^2} = -\frac{k}{m}x \ \Rightarrow\ x = A\sin\omega t + B\cos\omega t \ , \quad \omega = \sqrt{\frac{k}{m}} \tag{3.169} $$

Let $x = 0$ and $dx/dt = v_0$ at $t = 0$. Then $B = 0$ and $A = v_0/\omega$ so that

$$ x = \frac{v_0}{\omega}\sin\omega t \tag{3.170} $$

The classical motion is therefore bounded, $x \in [-x_{max}, +x_{max}]$ where $x_{max} = A = v_0/\omega$. We therefore expect the spectrum of $H$ to be entirely discrete when we do the quantum mechanical problem.


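Quantum Mechanical: We have

$$ \frac{d}{dt}\langle x\rangle = \frac{1}{i\hbar}\langle [x,H]\rangle = \frac{\langle p_x\rangle}{m} \tag{3.171} $$

$$ \frac{d}{dt}\langle p_x\rangle = \frac{1}{i\hbar}\langle [p_x,H]\rangle = -k\langle x\rangle \tag{3.172} $$

The time dependence of $\langle x\rangle$ is easily found:

$$ \frac{d^2}{dt^2}\langle x\rangle = \frac{1}{m}\frac{d}{dt}\langle p_x\rangle = -\frac{k}{m}\langle x\rangle \tag{3.173} $$

Therefore

$$ \langle x\rangle = A\sin\omega t + B\cos\omega t \quad \text{with} \quad \omega = \sqrt{\frac{k}{m}} \tag{3.174} $$

so that $\langle x\rangle$ follows the classical trajectory!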
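Because the two first-order equations for $\langle x\rangle$ and $\langle p_x\rangle$ close on themselves, they can be integrated directly and compared with the sinusoidal solution. The following is a minimal sketch that assumes a velocity-Verlet (kick-drift-kick) integrator and illustrative values $m = 1$, $k = 4$; neither the integrator nor the function name comes from the text.

```python
import math

def integrate_ehrenfest(x0, p0, mass, k, t_final, steps=10000):
    """Integrate d<x>/dt = <p>/m and d<p>/dt = -k<x> with a kick-drift-kick scheme."""
    dt = t_final / steps
    x, p = x0, p0
    for _ in range(steps):
        p -= 0.5 * dt * k * x      # half step for <p>
        x += dt * p / mass         # full step for <x>
        p -= 0.5 * dt * k * x      # second half step for <p>
    return x, p

mass, k, v0 = 1.0, 4.0, 1.0        # illustrative values, omega = sqrt(k/m) = 2
w = math.sqrt(k / mass)
for t in (0.5, 1.0, 3.0):
    x_num, _ = integrate_ehrenfest(0.0, mass * v0, mass, k, t)
    print(t, x_num, (v0 / w) * math.sin(w * t))   # compare with (v0/omega) sin(omega t)
```

With $\langle x\rangle = 0$ and $\langle p_x\rangle = m v_0$ at $t = 0$ the integrated average reproduces the classical result $x = (v_0/\omega)\sin\omega t$ to within the integration error.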


Let us now look for the stationary states of the harmonic oscillator. These are the energy eigenfunctions and are important for two reasons:

1. The corresponding energy eigenvalues are the only possible results from an energy measurement.

2. The time dependence of any $\psi(x,t)$ describing a particle in the presence of a harmonic oscillator force can be obtained easily.

We will find the energy eigenfunctions and eigenvalues by two very different methods:

(a) Differential equation method

(b) Operator algebra method

Let us express the Hamiltonian in terms of the classical (angular) frequency $\omega = \sqrt{k/m}$:

$$ H = \frac{p_x^2}{2m} + \frac{1}{2}m\omega^2x^2 \ , \quad H\phi(x) = E\phi(x) \tag{3.175} $$





3.6.1. Differential Equation Method 

We have

$$ H\phi(x) = E\phi(x) \ \Rightarrow\ -\frac{\hbar^2}{2m}\frac{d^2\phi(x)}{dx^2} + \frac{1}{2}m\omega^2x^2\phi(x) = E\phi(x) \tag{3.176} $$

We solve the differential equation for $\phi(x)$ with $E$ an arbitrary constant and find allowed values of $E$ by requiring that $\phi(x)$ not become infinite as $|x| \to \infty$. This will give us the entire spectrum of $H$, discrete and continuum parts.


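Step 1: Introduce dimensionless variables and parameters.

$$ -\frac{\hbar^2}{2m}\frac{d^2\phi(x)}{dx^2} + \frac{1}{2}m\omega^2x^2\phi(x) = E\phi(x) $$

$$ \Rightarrow\ \frac{1}{\left(\frac{m\omega}{\hbar}\right)}\frac{d^2\phi(x)}{dx^2} - \frac{m\omega}{\hbar}x^2\phi(x) = -\frac{2E}{\hbar\omega}\phi(x) $$

Define

$$ y = \sqrt{\frac{m\omega}{\hbar}}\,x \quad \text{and} \quad \varepsilon = \frac{2E}{\hbar\omega} \tag{3.177} $$

(both dimensionless). Therefore, we get

$$ \frac{d^2\phi}{dy^2} + (\varepsilon - y^2)\phi = 0 \tag{3.178} $$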

Step 2: Factor out the asymptotic ($y \to \pm\infty$) behavior of $\phi$. Let $y \to \infty$. Then $\varepsilon - y^2 \to -y^2$ so that we have the equation

$$ \frac{d^2\phi}{dy^2} - y^2\phi = 0 \tag{3.179} $$

As can be easily seen by direct substitution, this asymptotic equation is solved (to leading order in $y$) by

$$ \phi = y^{\alpha}e^{\pm\frac{1}{2}y^2} \tag{3.180} $$

for arbitrary constant $\alpha$. But $\phi$ cannot be infinite as $y \to \infty$. Therefore, the asymptotic solution

$$ y^{\alpha}e^{+\frac{1}{2}y^2} \tag{3.181} $$

must be discarded. Therefore, asymptotically the solution is

$$ \phi \to y^{\alpha}e^{-\frac{1}{2}y^2} \tag{3.182} $$

We now try to find an exact solution of the form

$$ \phi(x) = e^{-\frac{1}{2}y^2}F(y) \ , \quad y = \sqrt{\frac{m\omega}{\hbar}}\,x \tag{3.183} $$

This form is motivated by the form of the asymptotic solution. The equation which $F(y)$ obeys is obtained by substituting this form into

$$ \frac{d^2\phi}{dy^2} + (\varepsilon - y^2)\phi = 0 \tag{3.184} $$

We have

$$ \frac{d\phi}{dy} = -ye^{-\frac{1}{2}y^2}F + e^{-\frac{1}{2}y^2}\frac{dF}{dy} \tag{3.185} $$

$$ \frac{d^2\phi}{dy^2} = -e^{-\frac{1}{2}y^2}F + y^2e^{-\frac{1}{2}y^2}F - ye^{-\frac{1}{2}y^2}\frac{dF}{dy} - ye^{-\frac{1}{2}y^2}\frac{dF}{dy} + e^{-\frac{1}{2}y^2}\frac{d^2F}{dy^2} \tag{3.186} $$

so that the equation for $F(y)$ becomes

$$ \frac{d^2F}{dy^2} - 2y\frac{dF}{dy} + (\varepsilon - 1)F = 0 \tag{3.187} $$


which is Hermite's differential equation.

Step 3: Solve Hermite's differential equation by the power series (Taylor) method, where we expand $F(y)$ about $y = 0$. Let

$$ F(y) = \sum_{k=0}^{\infty}A_k y^k \tag{3.188} $$

and substitute into the differential equation. We have

$$ 0 = \sum_{k=0}^{\infty}k(k-1)A_k y^{k-2} - 2y\sum_{k=0}^{\infty}kA_k y^{k-1} + (\varepsilon - 1)\sum_{k=0}^{\infty}A_k y^k \tag{3.189} $$

The first series begins at $k = 2$; relabeling its index ($k \to k+2$) turns it into $\sum_{k=0}^{\infty}(k+2)(k+1)A_{k+2}y^k$. The second term is $\sum_{k=0}^{\infty}2kA_k y^k$. Therefore,

$$ 0 = \sum_{k=0}^{\infty}y^k\left\{(k+2)(k+1)A_{k+2} - 2kA_k + (\varepsilon - 1)A_k\right\} \quad \text{for all } y \tag{3.190} $$

Each of these coefficients must therefore be zero. Therefore

$$ A_{k+2} = \frac{(2k - \varepsilon + 1)}{(k+2)(k+1)}A_k \quad \text{for } k = 0,1,2,3,\ldots \tag{3.191} $$

This is called a recursion relation. Using the recursion relation

$A_0$ given $\Rightarrow$ $A_2, A_4, A_6, \ldots$ all determined

$A_1$ given $\Rightarrow$ $A_3, A_5, A_7, \ldots$ all determined

$A_0$ and $A_1$ are arbitrary. These are the two arbitrary constants which always appear in the most general solution of a second-order differential equation. Thus,

$$ F(y) = \underbrace{\sum_{k=0,2,4,6,\ldots}A_k y^k}_{=\,F_{even}(y)\ (\text{determined by } A_0)} + \underbrace{\sum_{k=1,3,5,\ldots}A_k y^k}_{=\,F_{odd}(y)\ (\text{determined by } A_1)} \tag{3.192} $$

Note: If $A_N = 0$ for some $N$, then $A_{N+2} = A_{N+4} = A_{N+6} = \ldots = 0$.
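The recursion relation is easy to iterate exactly. The sketch below is a hypothetical helper (the name `series_coefficients` and its arguments are assumptions, not notation from the text) that generates $A_0, \ldots, A_{k_{max}}$ with exact rational arithmetic, so that one can see for which values of $\varepsilon$ the numerator $2k - \varepsilon + 1$ vanishes and the series stops.

```python
from fractions import Fraction

def series_coefficients(epsilon, a0, a1, k_max):
    """Return [A_0, ..., A_k_max] from A_{k+2} = (2k - eps + 1) / ((k+2)(k+1)) * A_k."""
    eps = Fraction(epsilon)
    coeffs = [Fraction(0)] * (k_max + 1)
    coeffs[0] = Fraction(a0)
    if k_max >= 1:
        coeffs[1] = Fraction(a1)
    for k in range(k_max - 1):
        coeffs[k + 2] = (2 * k + 1 - eps) / ((k + 2) * (k + 1)) * coeffs[k]
    return coeffs

# epsilon = 5: the even series stops after the y^2 term (numerator vanishes at k = 2)
print(series_coefficients(5, 1, 0, 6))
# epsilon = 4: no integer k makes 2k - epsilon + 1 vanish, so the series never stops
print(series_coefficients(4, 1, 0, 6))
```

For $\varepsilon = 5$ and $A_0 = 1$, $A_1 = 0$ the output is the polynomial $1 - 2y^2$; for $\varepsilon = 4$ every even coefficient is non-zero, which is the non-terminating case.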


Step 4: $F(y)$ given above solves Hermite's differential equation for any given value of $\varepsilon$. The allowed values of

$$ E = \frac{\hbar\omega}{2}\varepsilon \tag{3.193} $$

are determined by requiring that $\phi$ not become infinite as $y \to \infty$. Now

$$ \phi = e^{-\frac{1}{2}y^2}F(y) $$

(i) If $F(y)$ is a polynomial of finite degree (the series terminates after some term), then $\phi(x)$ asymptotically goes as

$$ e^{-\frac{1}{2}y^2}\left[y^{\text{finite power}}\right] \to 0 \tag{3.194} $$

This is an acceptable $\phi(x)$.

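(ii) Suppose the series for $F(y)$ does not terminate. Then as $y \to \infty$, the terms with large $k$ dominate in

$$ F(y) = \sum_{k=0}^{\infty}A_k y^k \tag{3.195} $$

For $k \to \infty$

$$ \frac{A_{k+2}}{A_k} = \frac{(2k - \varepsilon + 1)}{(k+2)(k+1)} \ \xrightarrow[k\to\infty]{}\ \frac{2k}{k^2} = \frac{2}{k} $$

Therefore,

$$ F_{even}(y) = \sum_{k=0,2,4,\ldots}A_k y^k \ \Rightarrow\ \frac{A_{k+2}}{A_k} \to \frac{2}{k} \ , \qquad F_{odd}(y) = \sum_{k=1,3,5,\ldots}A_k y^k \ \Rightarrow\ \frac{A_{k+2}}{A_k} \to \frac{2}{k} $$

But,

$$ e^{+y^2} = \sum_{n=0}^{\infty}\frac{y^{2n}}{n!} = \sum_{k=0,2,4,\ldots}\frac{y^k}{\left(\frac{k}{2}\right)!} = \sum_{k\ even}B_k y^k \quad \text{with, for large even } k, \quad \frac{B_{k+2}}{B_k} = \frac{1/\left(\frac{k+2}{2}\right)!}{1/\left(\frac{k}{2}\right)!} = \frac{1}{\frac{k}{2}+1} \to \frac{2}{k} \tag{3.196} $$

and

$$ ye^{+y^2} = \sum_{n=0}^{\infty}\frac{y^{2n+1}}{n!} = \sum_{k=1,3,5,\ldots}\frac{y^k}{\left(\frac{k-1}{2}\right)!} = \sum_{k\ odd}C_k y^k \quad \text{with, for large odd } k, \quad \frac{C_{k+2}}{C_k} = \frac{1/\left(\frac{k+1}{2}\right)!}{1/\left(\frac{k-1}{2}\right)!} = \frac{1}{\frac{k+1}{2}} \to \frac{2}{k} \tag{3.197} $$

Therefore, as $y \to \infty$

$$ F_{even}(y) \to e^{+y^2} \ , \quad F_{odd}(y) \to ye^{+y^2} $$

if the series does not terminate. Therefore,

$$ \phi = e^{-\frac{1}{2}y^2}F(y) \ \Rightarrow\ \phi_{even} \to e^{+\frac{1}{2}y^2} \quad \text{and} \quad \phi_{odd} \to ye^{+\frac{1}{2}y^2} \tag{3.198} $$

These solutions diverge as $y \to \infty$ and therefore are unacceptable.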


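Conclusion: $F(y)$ must be a polynomial of finite degree, that is, the series for $F(y)$ must terminate after some term! This gives us the allowed values of $\varepsilon$. We have

$$ A_{k+2} = \frac{(2k - \varepsilon + 1)}{(k+2)(k+1)}A_k \tag{3.199} $$

Therefore, $F_{even}$ terminates if $A_0 = 0$ ($\Rightarrow F_{even} = 0$) or $2N_1 - \varepsilon + 1 = 0$ for some even $N_1$. In the latter case

$$ F_{even} = A_0 + A_2y^2 + \ldots + A_{N_1}y^{N_1} \tag{3.200} $$

Similarly, $F_{odd}$ terminates if $A_1 = 0$ ($\Rightarrow F_{odd} = 0$) or $2N_2 - \varepsilon + 1 = 0$ for some odd $N_2$. In the latter case

$$ F_{odd} = A_1y + A_3y^3 + \ldots + A_{N_2}y^{N_2} \tag{3.201} $$

Now $F = F_{even} + F_{odd}$. Therefore $F$ terminates if both $F_{even}$ and $F_{odd}$ terminate.

If neither $A_0$ nor $A_1$ vanished, this would mean we need

$$ 2N_1 - \varepsilon + 1 = 0 \ , \ \text{even } N_1 \quad \text{AND} \quad 2N_2 - \varepsilon + 1 = 0 \ , \ \text{odd } N_2 \tag{3.202} $$

which CANNOT be satisfied simultaneously. Therefore one of the two series must vanish identically, and we have either

(1) $A_0 = 0$ and $2N_2 - \varepsilon + 1 = 0$ for some odd $N_2$ $\Rightarrow$ $\varepsilon = 2N_2 + 1$

(2) $A_1 = 0$ and $2N_1 - \varepsilon + 1 = 0$ for some even $N_1$ $\Rightarrow$ $\varepsilon = 2N_1 + 1$

The allowed values of $\varepsilon$ are therefore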


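$$ \varepsilon_N = 2N + 1 \ , \quad N = 0,1,2,3,\ldots \tag{3.203} $$

and

$$ \varepsilon_N = \frac{2E_N}{\hbar\omega} \ \Rightarrow\ E_N = \frac{\hbar\omega}{2}(2N+1) \quad \text{for } N = 0,1,2,3,\ldots \tag{3.204} $$

The corresponding $F_N(y)$ is obtained from

$$ F_N(y) = \sum_{k=0}^{\infty}A_k^{(N)}y^k \tag{3.205} $$

where

$$ A_{k+2}^{(N)} = \frac{(2k - \varepsilon_N + 1)}{(k+2)(k+1)}A_k^{(N)} = \frac{2(k-N)}{(k+2)(k+1)}A_k^{(N)} \tag{3.206} $$

$$ A_{odd\ k}^{(even\ N)} = 0 = A_{even\ k}^{(odd\ N)} \tag{3.207} $$

Therefore,

$N$ even:

$$ F_N(y) = A_0^{(N)}\left\{1 + y^2\frac{2(0-N)}{2!} + y^4\frac{2^2(0-N)(2-N)}{4!} + y^6\frac{2^3(0-N)(2-N)(4-N)}{6!} + \ldots + y^N \text{ term}\right\} \tag{3.208} $$

$N$ odd:

$$ F_N(y) = A_1^{(N)}\left\{y + y^3\frac{2(1-N)}{3!} + y^5\frac{2^2(1-N)(3-N)}{5!} + \ldots + y^N \text{ term}\right\} \tag{3.209} $$

The energy eigenfunctions are

$$ \phi_N(x) = e^{-\frac{1}{2}y^2}F_N(y) \tag{3.210} $$

where

$$ y = \sqrt{\frac{m\omega}{\hbar}}\,x $$

Then

$$ H\phi_N = E_N\phi_N \quad \text{with} \quad E_N = \hbar\omega(N + 1/2) \ , \quad N = 0,1,2,3,\ldots $$

Notes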

1. The energy spectrum is entirely discrete and there is no degeneracy. 

2. $F_N(y)$ is a polynomial in $y$ of degree $N$. $N$ even means only even powers of $y$ occur and $N$ odd means only odd powers of $y$ occur.

3. $F_N(y)$ obeys the differential equation

$$ \frac{d^2F_N}{dy^2} - 2y\frac{dF_N}{dy} + 2NF_N = 0 $$

which is Hermite's differential equation.

4. $F_N(y) = H_N(y)$ = Hermite polynomial of order $N$, when the arbitrary constants $A_0$ and $A_1$ are chosen such that the coefficient of $y^N$ is $2^N$.

Examples

$H_0(y) = A_0^{(N=0)}\{1\}$: the $y^0$ coefficient is $2^0 = 1$ so that $A_0^{(N=0)} = 1$ and thus $H_0(y) = 1$.

$H_1(y) = A_1^{(N=1)}\{y\}$: the $y^1$ coefficient is $2^1 = 2$ so that $A_1^{(N=1)} = 2$ and thus $H_1(y) = 2y$.

$H_2(y) = A_0^{(N=2)}\left\{1 + y^2\frac{2(0-2)}{2!}\right\} = A_0^{(N=2)}\{1 - 2y^2\}$: the $y^2$ coefficient is $2^2 = 4$ so that $A_0^{(N=2)} = -2$ and thus $H_2(y) = 4y^2 - 2$.

Note that the Hermite polynomials are real functions of $y$.

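5. The normalized eigenfunctions of $H$ are

$$ u_N(x) = C_N H_N(y)e^{-\frac{1}{2}y^2} \tag{3.211} $$

where $|C_N|$ is determined from $\langle u_N \,|\, u_N\rangle = 1$. We have

$$ 1 = \langle u_N \,|\, u_N\rangle = \int_{-\infty}^{\infty}dx\, u_N^{*}(x)u_N(x) = |C_N|^2\int_{-\infty}^{\infty}dx\, [H_N(y)]^2e^{-y^2} $$

$$ y = \sqrt{\frac{m\omega}{\hbar}}\,x \ \Rightarrow\ dy = \sqrt{\frac{m\omega}{\hbar}}\,dx \ \Rightarrow\ 1 = |C_N|^2\sqrt{\frac{\hbar}{m\omega}}\int_{-\infty}^{\infty}dy\, [H_N(y)]^2e^{-y^2} $$

6. $\{u_N(x)\}_{N=0,1,2,\ldots}$ is a CON set with $\langle u_{N'} \,|\, u_N\rangle = \delta_{N'N}$:

$$
\begin{aligned}
\delta_{N'N} = \langle u_{N'} \,|\, u_N\rangle &= \int_{-\infty}^{\infty}dx\, C_{N'}^{*}H_{N'}(y)e^{-\frac{1}{2}y^2}\,C_N H_N(y)e^{-\frac{1}{2}y^2} \\
&= C_{N'}^{*}C_N\sqrt{\frac{\hbar}{m\omega}}\int_{-\infty}^{\infty}dy\, H_{N'}(y)H_N(y)e^{-y^2}
\end{aligned}
$$

In particular,

$$ \int_{-\infty}^{\infty}dy\, H_{N'}(y)H_N(y)e^{-y^2} = 0 \tag{3.212} $$

for $N' \neq N$. Thus, the Hermite polynomials are orthogonal with respect to the weighting factor $e^{-y^2}$.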

7. $E_N = \hbar\omega(N + 1/2)$

(a) $E_0 = \hbar\omega/2$ = ground state energy. Classically, the oscillator could stay at $x = 0$ with zero velocity, which would be the minimum energy state (energy = 0). Quantum mechanically, the uncertainty principle implies that the oscillator cannot be precisely at $x = 0$ with $p_x$ precisely zero. Thus, the minimum energy $> 0$.

(b) $E_{N+1} - E_N = \hbar\omega$ independent of $N$. The energy levels are evenly spaced.

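8. Ground State ($N = 0$):

$$ u_0(x) = C_0H_0(y)e^{-\frac{1}{2}y^2} = C_0e^{-\frac{1}{2}y^2} \ , \quad y = \sqrt{\frac{m\omega}{\hbar}}\,x \tag{3.213} $$

$C_0$ is found as in note (5):

$$ 1 = |C_0|^2\sqrt{\frac{\hbar}{m\omega}}\int_{-\infty}^{\infty}dy\, [H_0(y)]^2e^{-y^2} = |C_0|^2\sqrt{\frac{\hbar}{m\omega}}\int_{-\infty}^{\infty}dy\, e^{-y^2} = |C_0|^2\sqrt{\frac{\pi\hbar}{m\omega}} $$

Choosing $C_0$ to be a positive real number, we have

$$ C_0 = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \tag{3.214} $$

$$ u_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-\frac{m\omega}{2\hbar}x^2} \tag{3.215} $$

Let $\psi(x,0) = u_0(x)$ $\Rightarrow$ $\psi(x,t) = u_0(x)e^{-iE_0t/\hbar} = u_0(x)e^{-i\omega t/2}$ for a particle in the ground state for all $t$. Therefore,

$$ \wp(x,t) = \frac{|\psi(x,t)|^2}{\langle \psi(t) \,|\, \psi(t)\rangle} = |u_0(x)|^2 = \left(\frac{m\omega}{\pi\hbar}\right)^{1/2}e^{-\frac{m\omega}{\hbar}x^2} \tag{3.216} $$

This is independent of time, as expected for a stationary state. Classically, $x_{max}$ occurs when $p_x = 0$ (a turning point). Therefore,

$$ E = \frac{p_x^2}{2m} + \frac{1}{2}m\omega^2x^2 = \frac{1}{2}m\omega^2x_{max}^2 \ \Rightarrow\ x_{max}^2 = \frac{2E}{m\omega^2} \tag{3.217} $$

For $E = E_0 = \hbar\omega/2$ we have $x_{max}^2 = \hbar/m\omega$. Therefore,

$$ \wp(x,t) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/2}e^{-x^2/x_{max}^2} \tag{3.218} $$

for the ground state as shown in Figure 3.4 below.

Figure 3.4: Probability Function (the classical motion is confined to the marked region $|x| \leq x_{max}$)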


Now the classical motion is confined to the region $x \in [-x_{max}, +x_{max}]$. Notice that there is a non-zero probability to find the particle outside the classically allowed region! This may seem strange because $x > x_{max}$ $\Rightarrow$ $V(x) > E_{N=0}$ $\Rightarrow$ $KE < 0$!

However, if a measurement tells us that the particle is not in the classical region (shine light over the classical region and look for the particle), then the wave function will collapse to the wave function shown in Figure 3.5 below (measurement is made at $t_0$).

Figure 3.5: Wave Function After Measurement

This is no longer an eigenfunction of energy. Therefore, if the particle is known to be somewhere outside the classical region, the particle does not have a definite energy ($\Delta E \neq 0$) and the statement $V(x) > E_{N=0}$ is no longer applicable.

9. Let $\psi(x,0)$ be arbitrarily given. It need not be an energy eigenfunction of $H$. What is $\psi(x,t)$ for a harmonic oscillator?

$$ \psi(x,t) = \sum_{N=0}^{\infty}C_N(0)e^{-iE_Nt/\hbar}u_N(x) = e^{-i\omega t/2}\sum_{N=0}^{\infty}C_N(0)e^{-iN\omega t}u_N(x) \tag{3.219} $$

where $E_N = \hbar\omega(N + 1/2)$ and the $C_N(0)$ are determined from

$$ \psi(x,0) = \sum_{N=0}^{\infty}C_N(0)u_N(x) \ \Rightarrow\ C_N(0) = \langle u_N \,|\, \psi(t=0)\rangle = \int_{-\infty}^{\infty}dx\, u_N^{*}(x)\psi(x,0) \tag{3.220} $$
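The size of the non-classical tail discussed in note 8 can be estimated numerically. In the dimensionless variable $y = x/x_{max}$ the ground-state density per unit $y$ is $e^{-y^2}/\sqrt{\pi}$ and the classical turning points sit at $y = \pm 1$. The sketch below integrates this density with a simple trapezoid rule and compares the result with the complementary error function; the function names and the choice of integration rule are assumptions made for illustration.

```python
import math

def ground_state_density(y):
    """Probability density per unit y for N = 0: exp(-y^2) / sqrt(pi)."""
    return math.exp(-y * y) / math.sqrt(math.pi)

def probability_inside(y_turn=1.0, n=2000):
    """Trapezoid-rule integral of the density over the classical region |y| <= y_turn."""
    h = 2.0 * y_turn / n
    total = 0.5 * (ground_state_density(-y_turn) + ground_state_density(y_turn))
    for i in range(1, n):
        total += ground_state_density(-y_turn + i * h)
    return total * h

p_outside = 1.0 - probability_inside()
print("probability outside the classical region:", p_outside)
print("closed form erfc(1):                      ", math.erfc(1.0))
```

Both numbers come out close to 0.157, so a position measurement on the ground state finds the particle outside the classically allowed region roughly 16% of the time.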


3.6.2. Algebraic Method 

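Given

$$ H = \frac{p_x^2}{2m} + \frac{1}{2}m\omega^2x^2 \tag{3.221} $$

where $x$ and $p_x$ are hermitian operators satisfying $[x,p_x] = i\hbar$. We will now solve the energy eigenvalue problem $H\phi_E = E\phi_E$ by using only

(1) the hermiticity of $x$ and $p_x$
and

(2) the above commutation relation

We will not need the explicit representation

$$ p_x = \frac{\hbar}{i}\frac{d}{dx} \tag{3.222} $$

required by working in the $x$-$p$ representations.

Because $H$ is a sum of a $p^2$ and an $x^2$ term, we will try to write $H$ as the product of two factors, each of which is linear in $x$ and $p_x$. We have

$$
\begin{aligned}
\left(\sqrt{\frac{m\omega^2}{2}}\,x - i\sqrt{\frac{1}{2m}}\,p_x\right)\left(\sqrt{\frac{m\omega^2}{2}}\,x + i\sqrt{\frac{1}{2m}}\,p_x\right) &= \frac{m\omega^2}{2}x^2 + \frac{p_x^2}{2m} + i\frac{\omega}{2}xp_x - i\frac{\omega}{2}p_xx \\
&= H + i\frac{\omega}{2}[x,p_x] = H - \frac{\hbar\omega}{2}
\end{aligned}
\tag{3.223}
$$

where we must be careful to preserve the ordering of the non-commuting operators $x$ and $p_x$. Therefore

$$ H = \left(\sqrt{\frac{m\omega^2}{2}}\,x - i\sqrt{\frac{1}{2m}}\,p_x\right)\left(\sqrt{\frac{m\omega^2}{2}}\,x + i\sqrt{\frac{1}{2m}}\,p_x\right) + \frac{\hbar\omega}{2} \tag{3.224} $$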


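We now define a new operator

$$ a = \sqrt{\frac{m\omega}{2\hbar}}\,x + i\frac{1}{\sqrt{2m\hbar\omega}}\,p_x \tag{3.225} $$

Because $x$ and $p_x$ are hermitian, we have

$$ a^{+} = \sqrt{\frac{m\omega}{2\hbar}}\,x - i\frac{1}{\sqrt{2m\hbar\omega}}\,p_x \tag{3.226} $$

The Hamiltonian then becomes the simple expression

$$ H = \hbar\omega(a^{+}a + 1/2) \tag{3.227} $$

The following commutators will be important in our discussion:

$$
\begin{aligned}
[a,a^{+}] &= \left[\sqrt{\frac{m\omega}{2\hbar}}\,x + i\frac{1}{\sqrt{2m\hbar\omega}}\,p_x \,,\ \sqrt{\frac{m\omega}{2\hbar}}\,x - i\frac{1}{\sqrt{2m\hbar\omega}}\,p_x\right] \\
&= \frac{i}{2\hbar}[p_x,x] - \frac{i}{2\hbar}[x,p_x] = \frac{i}{\hbar}[p_x,x] = \frac{i}{\hbar}(-i\hbar) = 1
\end{aligned}
\tag{3.228}
$$

or $[a,a^{+}] = 1$.

$$
\begin{aligned}
[a,H] &= [a,\hbar\omega(a^{+}a + 1/2)] = \hbar\omega[a,a^{+}a] \\
&= \hbar\omega(aa^{+}a - a^{+}aa) = \hbar\omega((a^{+}a + 1)a - a^{+}aa) = \hbar\omega a
\end{aligned}
\tag{3.229}
$$

or $[a,H] = \hbar\omega a$.

$$
\begin{aligned}
[a^{+},H] &= [a^{+},\hbar\omega(a^{+}a + 1/2)] = \hbar\omega[a^{+},a^{+}a] \\
&= \hbar\omega(a^{+}a^{+}a - a^{+}aa^{+}) = \hbar\omega(a^{+}a^{+}a - a^{+}(a^{+}a + 1)) = -\hbar\omega a^{+}
\end{aligned}
\tag{3.230}
$$

or $[a^{+},H] = -\hbar\omega a^{+}$.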


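These commutation relations imply the following:

1. $E \geq \hbar\omega/2$ with equality if $a\phi_E = 0$

Proof: we have

$$ 0 \leq \langle a\phi_E \,|\, a\phi_E\rangle = \langle \phi_E \,|\, a^{+}a\phi_E\rangle = \left\langle \phi_E \,\Big|\, \left(\frac{H}{\hbar\omega} - \frac{1}{2}\right)\phi_E\right\rangle = \left(\frac{E}{\hbar\omega} - \frac{1}{2}\right)\langle \phi_E \,|\, \phi_E\rangle \tag{3.231} $$

But $\langle \phi_E \,|\, \phi_E\rangle > 0$ ($\phi_E$ not identically zero). Therefore,

$$ \left(\frac{E}{\hbar\omega} - \frac{1}{2}\right) \geq 0 \ \Rightarrow\ E \geq \frac{\hbar\omega}{2} \ \text{ with equality if } a\phi_E = 0 \tag{3.232} $$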


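2. $a^{+}\phi_E \neq 0$ for any $E$

Proof: we have

$$ \langle a^{+}\phi_E \,|\, a^{+}\phi_E\rangle = \langle \phi_E \,|\, aa^{+}\phi_E\rangle = \langle \phi_E \,|\, (1 + a^{+}a)\phi_E\rangle = \left(\frac{E}{\hbar\omega} + \frac{1}{2}\right)\langle \phi_E \,|\, \phi_E\rangle $$

Now $\langle \phi_E \,|\, \phi_E\rangle > 0$ and $E \geq \hbar\omega/2$, so that

$$ \langle a^{+}\phi_E \,|\, a^{+}\phi_E\rangle \neq 0 \ , \ \text{that is, } a^{+}\phi_E \neq 0 \tag{3.233} $$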


3. $a^{+}\phi_E$ is an energy eigenfunction ($a^{+}\phi_E \neq 0$) of $H$ with eigenvalue $E + \hbar\omega$

Proof: we have

$$ H(a^{+}\phi_E) = ([H,a^{+}] + a^{+}H)\phi_E = (\hbar\omega a^{+} + a^{+}E)\phi_E = (E + \hbar\omega)(a^{+}\phi_E) $$

4. $a\phi_E = 0$ or $a\phi_E$ is an energy eigenfunction ($a\phi_E \neq 0$) of $H$ with eigenvalue $E - \hbar\omega$

Proof: we have

$$ H(a\phi_E) = ([H,a] + aH)\phi_E = (-\hbar\omega a + aE)\phi_E = (E - \hbar\omega)(a\phi_E) \tag{3.234} $$

If $a\phi_E \neq 0$, then this equation implies that $a\phi_E$ is an energy eigenfunction of $H$.

Note: Because $a^{+}$ increases the eigenvalue $E$ by an increment $\hbar\omega$ (it creates added energy $\hbar\omega$), $a^{+}$ is called a raising operator or a creation operator. Because $a$ decreases the eigenvalue $E$ by an increment $\hbar\omega$ (it annihilates added energy $\hbar\omega$), $a$ is called a lowering operator or an annihilation operator.


Given a specific $\phi_E$, we can form the sequences (each function is listed with its energy eigenvalue):

$$ \underbrace{\phi_E}_{E} \ , \ \underbrace{a\phi_E}_{E-\hbar\omega} \ , \ \underbrace{a^2\phi_E}_{E-2\hbar\omega} \ , \ \underbrace{a^3\phi_E}_{E-3\hbar\omega} \ , \ \ldots $$

$$ \underbrace{\phi_E}_{E} \ , \ \underbrace{a^{+}\phi_E}_{E+\hbar\omega} \ , \ \underbrace{(a^{+})^2\phi_E}_{E+2\hbar\omega} \ , \ \underbrace{(a^{+})^3\phi_E}_{E+3\hbar\omega} \ , \ \ldots $$

But the energy eigenvalues must be $\geq \hbar\omega/2$. Therefore, the first of the above sequences ($E, E-\hbar\omega, E-2\hbar\omega, E-3\hbar\omega, \ldots$) must terminate (otherwise, we would eventually obtain eigenvalues less than $\hbar\omega/2$). This termination occurs if there exists an $n$ ($0,1,2,\ldots$) such that $a^n\phi_E$ is an energy eigenfunction (and therefore non-zero) with eigenvalue $E - n\hbar\omega$ and

$$ a^{n+1}\phi_E = 0 \ , \ a^{n+2}\phi_E = 0 \ , \ a^{n+3}\phi_E = 0 \ , \ \ldots \tag{3.235} $$

Result (1) above then implies that

$$ E - n\hbar\omega = \frac{\hbar\omega}{2} \ \Rightarrow\ E = \hbar\omega(n + 1/2) \tag{3.236} $$

This shows that if one is given an eigenvalue $E$, then it can be written $E = \hbar\omega(n + 1/2)$ for some non-negative integer $n$. It remains to be shown that $\hbar\omega(N + 1/2)$ for any $N = 0,1,2,\ldots$ is an eigenvalue. But this is easy since from above we have

$$ E = \hbar\omega(n + 1/2) \tag{3.237} $$

for some non-negative integer $n$. Forming the two sequences above, we find that

$$ \hbar\omega(n - 1/2) \ , \ \hbar\omega(n - 3/2) \ , \ \ldots \ , \ \frac{\hbar\omega}{2} \tag{3.238} $$

and

$$ \hbar\omega(n + 3/2) \ , \ \hbar\omega(n + 5/2) \ , \ \hbar\omega(n + 7/2) \ , \ \ldots \tag{3.239} $$

are also allowed energy eigenvalues. Thus,

$$ E_N = \hbar\omega(N + 1/2) \ \text{ with } N = 0,1,2,\ldots \tag{3.240} $$


yields the entire spectrum of $H$. The spectrum has now been obtained without using the explicit representation

$$ p_x = \frac{\hbar}{i}\frac{d}{dx} \tag{3.241} $$

Let $\phi_E(x) = u_N(x)$ with $\langle u_{N'} \,|\, u_N\rangle = \delta_{N'N}$.


Claim: All the energy eigenvalues are non-degenerate if the $N = 0$ eigenvalue ($E_0 = \hbar\omega/2$) is non-degenerate.

Proof: It is sufficient to show the following: If $E_N$ for some $N > 0$ is degenerate, then $E_{N-1}$ is also degenerate, that is, $E_{N-1}$ non-degenerate implies that $E_N$ is non-degenerate.

If $E_N$ is degenerate, one can find at least two orthogonal eigenfunctions ($u_N^{(1)}$ and $u_N^{(2)}$) with eigenvalue $E_N$ and $\langle u_N^{(1)} \,|\, u_N^{(2)}\rangle = 0$. However,

$$ \left\langle u_N^{(1)} \,\Big|\, a^{+}a\,u_N^{(2)}\right\rangle = \left\langle u_N^{(1)} \,\Big|\, \left(\frac{H}{\hbar\omega} - \frac{1}{2}\right)u_N^{(2)}\right\rangle = \left(\frac{E_N}{\hbar\omega} - \frac{1}{2}\right)\left\langle u_N^{(1)} \,\Big|\, u_N^{(2)}\right\rangle = 0 $$

Therefore $0 = \langle u_N^{(1)} \,|\, a^{+}a\,u_N^{(2)}\rangle = \langle au_N^{(1)} \,|\, au_N^{(2)}\rangle$. But $N > 0$ implies that $E_N > \hbar\omega/2$ which implies $au_N^{(1)} \neq 0$ and $au_N^{(2)} \neq 0$. Therefore, $au_N^{(1)}$ and $au_N^{(2)}$ are orthogonal eigenfunctions (not identically zero) corresponding to the eigenvalue $E_{N-1}$, which is therefore degenerate.


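Let us show by explicit construction that $E_{N=0} = \hbar\omega/2$ is non-degenerate. We can then conclude that the entire spectrum of $H$ is non-degenerate.

Ground State: $N = 0$, $E_0 = \hbar\omega/2$. We have

$$ E_0 = \frac{\hbar\omega}{2} \ \Leftrightarrow\ au_0 = 0 \tag{3.242} $$

which gives all eigenfunctions for $N = 0$. Now

$$ a = \sqrt{\frac{m\omega}{2\hbar}}\,x + \frac{i}{\sqrt{2m\hbar\omega}}\,p_x = \sqrt{\frac{m\omega}{2\hbar}}\,x + \sqrt{\frac{\hbar}{2m\omega}}\frac{d}{dx} \tag{3.243} $$

using

$$ p_x = \frac{\hbar}{i}\frac{d}{dx} \tag{3.244} $$

Therefore,

$$ au_0 = 0 \ \Rightarrow\ \left(\frac{d}{dx} + \frac{m\omega}{\hbar}x\right)u_0 = 0 \tag{3.245} $$

so that

$$ u_0(x) = C_0e^{-\frac{m\omega}{2\hbar}x^2} \quad \text{where } C_0 = \text{any constant} \tag{3.246} $$

All eigenfunctions for $N = 0$ have this form. Thus, the $N = 0$ level is non-degenerate. This implies that the entire spectrum of $H$ is non-degenerate. Now from an earlier result

$$ \langle u_0 \,|\, u_0\rangle = 1 \ \Rightarrow\ |C_0| = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \tag{3.247} $$

Choosing $C_0$ to be positive real we have

$$ u_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-\frac{m\omega}{2\hbar}x^2} \tag{3.248} $$

One can easily obtain any energy state ($u_N$ for $N > 0$) by applying the raising operator $a^{+}$ a sufficient number of times to the ground state.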


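Note: $a^{+}u_N$ has energy $E_N + \hbar\omega = E_{N+1}$ where $u_N$ has energy $E_N = \hbar\omega(N + 1/2)$. But $E_{N+1}$ is non-degenerate. Therefore, $a^{+}u_N = A_Nu_{N+1}$ where $A_N$ is some constant. $|A_N|$ can be determined from the fact that $u_N$ and $u_{N+1}$ are normalized.

$$ \langle a^{+}u_N \,|\, a^{+}u_N\rangle = |A_N|^2\underbrace{\langle u_{N+1} \,|\, u_{N+1}\rangle}_{=1} = |A_N|^2 \tag{3.249} $$

$$ \langle a^{+}u_N \,|\, a^{+}u_N\rangle = \langle u_N \,|\, aa^{+}u_N\rangle = \langle u_N \,|\, ([a,a^{+}] + a^{+}a)u_N\rangle = \left\langle u_N \,\Big|\, \left(1 + \frac{H}{\hbar\omega} - \frac{1}{2}\right)u_N\right\rangle = (N+1)\underbrace{\langle u_N \,|\, u_N\rangle}_{=1} = |A_N|^2 \tag{3.250} $$

so that

$$ |A_N| = \sqrt{N+1} \ \Rightarrow\ a^{+}u_N = \sqrt{N+1}\,u_{N+1} \tag{3.251} $$

which specifies $u_{N+1}$ in terms of $u_N$. Therefore,

$$ aa^{+}u_N = \sqrt{N+1}\,au_{N+1} \tag{3.252} $$

and

$$ aa^{+}u_N = ([a,a^{+}] + a^{+}a)u_N = \left(\frac{H}{\hbar\omega} + \frac{1}{2}\right)u_N = \left(\frac{E_N}{\hbar\omega} + \frac{1}{2}\right)u_N = (N+1)u_N = \sqrt{N+1}\,au_{N+1} \tag{3.253} $$

so that

$$ au_{N+1} = \sqrt{N+1}\,u_N \ \Rightarrow\ au_N = \sqrt{N}\,u_{N-1} \tag{3.254} $$

and, in particular, $au_0 = 0$.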


Note that $a^{+}au_N = \sqrt{N}\,a^{+}u_{N-1} = \sqrt{N}\sqrt{N}\,u_N = Nu_N$ so that $u_N$ is also an eigenfunction of $a^{+}a$ (which differs from $H/\hbar\omega$ only by an additive constant).
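The relations $au_N = \sqrt{N}\,u_{N-1}$ and $a^{+}u_N = \sqrt{N+1}\,u_{N+1}$ fix the matrix elements of $a$ and $a^{+}$ in the basis $\{u_N\}$. The sketch below builds these matrices for the first few states and checks $a^{+}a$ and $[a,a^{+}]$. It is an illustration only: the basis is truncated to `dim` states (an assumption forced by working with finite matrices), so the last diagonal entry of the commutator is an artifact of the truncation, and the helper names are hypothetical.

```python
import math

def lowering_matrix(dim):
    """Matrix elements <u_M | a u_N> = sqrt(N) for M = N - 1, zero otherwise."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

dim = 5
a = lowering_matrix(dim)
a_dag = transpose(a)                 # entries are real, so the adjoint is the transpose
number = matmul(a_dag, a)            # a+ a, expected diag(0, 1, 2, 3, 4)
aa_dag = matmul(a, a_dag)
comm = [[aa_dag[i][j] - number[i][j] for j in range(dim)] for i in range(dim)]

print("diagonal of a+ a  :", [round(number[i][i], 12) for i in range(dim)])
print("diagonal of [a,a+]:", [round(comm[i][i], 12) for i in range(dim)])
```

The diagonal of $a^{+}a$ reproduces $N = 0,1,2,\ldots$, and $[a,a^{+}]$ equals 1 on every state except the last one kept, where the missing state $u_{dim}$ spoils the identity.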


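We can now find $u_N(x)$ in terms of the ground state $u_0(x)$. We use

$$ a^{+}u_N = \sqrt{N+1}\,u_{N+1} \ \Rightarrow\ a^{+}u_{N-1} = \sqrt{N}\,u_N $$

Thus,

$$ u_N = \frac{1}{\sqrt{N}}a^{+}u_{N-1} = \frac{1}{\sqrt{N}}a^{+}\left(\frac{1}{\sqrt{N-1}}a^{+}u_{N-2}\right) = \frac{1}{\sqrt{N}}a^{+}\left(\frac{1}{\sqrt{N-1}}a^{+}\left(\frac{1}{\sqrt{N-2}}a^{+}u_{N-3}\right)\right) = \ldots $$

or

$$ u_N = \frac{1}{\sqrt{N!}}(a^{+})^Nu_0 \tag{3.255} $$

Now

$$ a^{+} = \sqrt{\frac{m\omega}{2\hbar}}\,x - \frac{i}{\sqrt{2m\hbar\omega}}\,p_x = \sqrt{\frac{m\omega}{2\hbar}}\,x - \sqrt{\frac{\hbar}{2m\omega}}\frac{d}{dx} \tag{3.256} $$

so that

$$ u_N(x) = \frac{1}{\sqrt{N!}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\left(\sqrt{\frac{m\omega}{2\hbar}}\,x - \sqrt{\frac{\hbar}{2m\omega}}\frac{d}{dx}\right)^Ne^{-\frac{m\omega}{2\hbar}x^2} $$

Now, as earlier, let

$$ y = \sqrt{\frac{m\omega}{\hbar}}\,x \tag{3.257} $$

(a dimensionless variable). Therefore,

$$ u_N(x) = \frac{1}{\sqrt{N!}}\frac{1}{2^{N/2}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\left(y - \frac{d}{dy}\right)^Ne^{-\frac{1}{2}y^2} \tag{3.258} $$

which is an explicit formula for $u_N$.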

Note:

$$ \left(y - \frac{d}{dy}\right)^Ne^{-\frac{1}{2}y^2} = \underbrace{\left(y - \frac{d}{dy}\right)\left(y - \frac{d}{dy}\right)\cdots\left(y - \frac{d}{dy}\right)}_{N \text{ factors, each acts on everything which appears to the right of it}}e^{-\frac{1}{2}y^2} = e^{-\frac{1}{2}y^2}(\text{polynomial of degree } N) \tag{3.259} $$

The coefficient of $y^N$ in this polynomial is $2^N$.

Proof: To obtain $y^N$ a factor of $y$ must come from each $(y - d/dy)$ term. For each term, the $y$ can come from one of two places (from $y$ or from $-d(e^{-\frac{1}{2}y^2})/dy$). There are therefore $2^N$ possible ways to obtain $y^N$.
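The action of $(y - d/dy)$ on a polynomial times $e^{-\frac{1}{2}y^2}$ can be carried out on the polynomial coefficients alone, since $(y - d/dy)\left[P(y)e^{-\frac{1}{2}y^2}\right] = \left[2yP(y) - P'(y)\right]e^{-\frac{1}{2}y^2}$. The sketch below iterates this rule starting from $P = 1$; the list-of-coefficients representation and the function names are assumptions made for illustration.

```python
def apply_y_minus_ddy(poly):
    """poly[k] is the coefficient of y^k in P; return the coefficients of 2yP - P'."""
    result = [0] * (len(poly) + 1)
    for k, c in enumerate(poly):
        result[k + 1] += 2 * c          # contribution of 2y * (c y^k)
        if k >= 1:
            result[k - 1] -= k * c      # contribution of -(d/dy)(c y^k)
    return result

def polynomial_after_n_factors(n):
    """Polynomial multiplying exp(-y^2/2) after applying (y - d/dy) n times."""
    poly = [1]
    for _ in range(n):
        poly = apply_y_minus_ddy(poly)
    return poly

for n in range(5):
    print(n, polynomial_after_n_factors(n))
```

For $N = 2$ the coefficients are those of $4y^2 - 2$, the polynomial $H_2(y)$ found with the differential equation method, and in every case the leading coefficient is $2^N$ as claimed in the proof above.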


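We can use this result for $u_N(x)$ to obtain an explicit expression for the Hermite polynomials $H_N(y)$. From the differential equation method we found that

$$ u_N(x) = C_NH_N(y)e^{-\frac{1}{2}y^2} \tag{3.260} $$

Comparing our two expressions for $u_N(x)$

$$ C_NH_N(y)e^{-\frac{1}{2}y^2} = \frac{1}{\sqrt{N!}}\frac{1}{2^{N/2}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\left(y - \frac{d}{dy}\right)^Ne^{-\frac{1}{2}y^2} \tag{3.261} $$

We can avoid doing the integral $\langle u_N \,|\, u_N\rangle = 1$ to evaluate $C_N$ by recalling that $H_N(y)$ = polynomial of degree $N$ with $2^N$ as the coefficient of $y^N$. But

$$ \left(y - \frac{d}{dy}\right)^Ne^{-\frac{1}{2}y^2} = e^{-\frac{1}{2}y^2}\,(\text{polynomial of degree } N \text{ with } 2^N \text{ as the coefficient of } y^N) \tag{3.262} $$

Therefore,

$$ H_N(y) = e^{+\frac{1}{2}y^2}\left(y - \frac{d}{dy}\right)^Ne^{-\frac{1}{2}y^2} \tag{3.263} $$

and

$$ u_N(x) = \frac{1}{\sqrt{2^NN!}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}H_N(y)e^{-\frac{1}{2}y^2} \tag{3.264} $$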

Comment on Parity: $u_N(x)$ is an even function of $x$ for $N$ even and an odd function of $x$ for $N$ odd (because $H_N(y)$ has this property). Thus, the eigenfunctions of $H$ are either even or odd functions. We could have anticipated this result from the following discussion.

The parity operator $\Pi$ is defined by its action on the wave function:

$$ \Pi\psi(x,t) = \psi(-x,t) \tag{3.265} $$

The parity operator $\Pi$ is linear and hermitian. Since

$$ \Pi^2\psi(x,t) = \Pi\psi(-x,t) = \psi(x,t) \tag{3.266} $$

we have

$$ \Pi^2 = I \tag{3.267} $$

and since its eigenvalue equation

$$ \Pi\psi = \lambda\psi \tag{3.268} $$

implies

$$ \Pi^2\psi = \lambda\Pi\psi = \lambda^2\psi = I\psi = \psi \ \Rightarrow\ \lambda^2 = 1 \ \Rightarrow\ \lambda = \pm 1 $$

the eigenvalues of $\Pi$ are $\pm 1$. We then have

$$ \Pi\psi_{even}(x,t) = \psi_{even}(-x,t) = \psi_{even}(x,t) \tag{3.269} $$

$$ \Pi\psi_{odd}(x,t) = \psi_{odd}(-x,t) = -\psi_{odd}(x,t) \tag{3.270} $$

so that any even function of $x$ is an eigenfunction of $\Pi$ with eigenvalue $+1$ and any odd function of $x$ is an eigenfunction of $\Pi$ with eigenvalue $-1$.


Let us apply this result to the one-dimensional harmonic oscillator:
\[ \Pi H\psi(x,t) = \Pi\left[\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2x^2\right)\psi(x,t)\right] = \left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2x^2\right)\psi(-x,t) = H\Pi\psi(x,t) \]
so that (since $\psi(x,t)$ is arbitrary)
\[ \Pi H = H\Pi \;\Rightarrow\; [\Pi, H] = 0 \tag{3.271} \]

Therefore, we can find a CON set of simultaneous eigenfunctions of $H$ and $\Pi$. But the eigenfunctions of $H$ are non-degenerate, and therefore, each $u_N(x)$ is an eigenfunction of $\Pi$ also. This means that each $u_N(x)$ obeys $\Pi u_N(x) = \pm u_N(x)$ or that each $u_N(x)$ is either an even function of $x$ or an odd function of $x$.


3.6.3. Use of Raising and Lowering Operators 


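One can compute inner products of the form $\langle u_{N'} \mid A(x,p_x)\,u_N \rangle$ without doing any cumbersome integrals. Now
\[ a = \sqrt{\frac{m\omega}{2\hbar}}\,x + \frac{i}{\sqrt{2m\hbar\omega}}\,p_x \quad\text{and}\quad a^+ = \sqrt{\frac{m\omega}{2\hbar}}\,x - \frac{i}{\sqrt{2m\hbar\omega}}\,p_x \]
imply that
\[ x = \sqrt{\frac{\hbar}{2m\omega}}\,(a + a^+) \quad\text{and}\quad p_x = \frac{1}{i}\sqrt{\frac{m\hbar\omega}{2}}\,(a - a^+) \tag{3.272} \]
Thus, $A(x,p_x)$ can be expressed in terms of the raising and lowering operators, where
\[ a\,u_N = \sqrt{N}\,u_{N-1} \quad\text{and}\quad a^+u_N = \sqrt{N+1}\,u_{N+1} \tag{3.273} \]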

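Example: Consider the stationary state
\[ \psi(x,t) = u_N(x)\,e^{-iE_Nt/\hbar} \tag{3.274} \]
We have
\[ \langle \psi(t) \mid \psi(t) \rangle = 1 \tag{3.275} \]
and
\[ \langle x \rangle = \langle \psi(t) \mid x\psi(t) \rangle = \left\langle u_Ne^{-iE_Nt/\hbar} \mid x\,u_Ne^{-iE_Nt/\hbar} \right\rangle = \langle u_N \mid x\,u_N \rangle \]
\[ = \sqrt{\frac{\hbar}{2m\omega}}\,\langle u_N \mid (a + a^+)u_N \rangle = \sqrt{\frac{\hbar}{2m\omega}}\left(\langle u_N \mid a\,u_N \rangle + \langle u_N \mid a^+u_N \rangle\right) \]
\[ = \sqrt{\frac{\hbar}{2m\omega}}\Big(\sqrt{N}\underbrace{\langle u_N \mid u_{N-1} \rangle}_{=0} + \sqrt{N+1}\underbrace{\langle u_N \mid u_{N+1} \rangle}_{=0}\Big) = 0 \]
so that $\langle x \rangle = 0$ for the given stationary state. Now
\[ \langle x^2 \rangle = \langle \psi(t) \mid x^2\psi(t) \rangle = \left\langle u_Ne^{-iE_Nt/\hbar} \mid x^2u_Ne^{-iE_Nt/\hbar} \right\rangle = \langle u_N \mid x^2u_N \rangle \]
\[ = \frac{\hbar}{2m\omega}\,\langle u_N \mid (a + a^+)^2u_N \rangle = \frac{\hbar}{2m\omega}\,\langle u_N \mid (aa + aa^+ + a^+a + a^+a^+)u_N \rangle \]
\[ = \frac{\hbar}{2m\omega}\Big[\sqrt{N}\sqrt{N-1}\underbrace{\langle u_N \mid u_{N-2} \rangle}_{=0} + \sqrt{N+1}\sqrt{N+1}\underbrace{\langle u_N \mid u_N \rangle}_{=1} + \sqrt{N}\sqrt{N}\underbrace{\langle u_N \mid u_N \rangle}_{=1} + \sqrt{N+1}\sqrt{N+2}\underbrace{\langle u_N \mid u_{N+2} \rangle}_{=0}\Big] \]
\[ = \frac{\hbar}{m\omega}\left(N + \frac{1}{2}\right) = \frac{E_N}{m\omega^2} \tag{3.276} \]

Therefore,
\[ (\Delta x)^2 = \langle x^2 \rangle - \langle x \rangle^2 = \frac{E_N}{m\omega^2} \;\Rightarrow\; \Delta x = \sqrt{\frac{E_N}{m\omega^2}} \tag{3.277} \]

It is quite reasonable for $\Delta x$ to increase with $E_N$. Classically,
\[ E = \frac{1}{2}m\omega^2x_{max}^2 \tag{3.278} \]
Therefore,
\[ x_{max} = \sqrt{\frac{2E}{m\omega^2}} \tag{3.279} \]
which increases with $E$. Quantum mechanically, one has $\langle x \rangle = 0$ with a spread in positions of the order of $x_{max}$.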
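The result $\langle x^2 \rangle = (\hbar/m\omega)(N + 1/2)$ can be cross-checked numerically by building the matrix of $a$ in a truncated basis $u_0, \dots, u_{d-1}$ and squaring $a + a^+$. This is only a sketch under stated assumptions: units with $\hbar = m = \omega = 1$, and the helper names `lowering`, `matmul` and `x_squared_expectations` are hypothetical, chosen for this illustration. Because the basis is truncated, the last diagonal entry is not reliable and is excluded from the comparison.

```python
import math


def lowering(dim):
    """Matrix of a in the basis u_0..u_{dim-1}: <u_{N-1} | a u_N> = sqrt(N)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


def matmul(p, q):
    """Plain matrix product of two square matrices given as nested lists."""
    dim = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]


def x_squared_expectations(dim):
    """Diagonal of x^2 = (1/2)(a + a^+)^2, in units hbar = m = omega = 1."""
    a = lowering(dim)
    # a is real, so a^+ is just the transpose
    s = [[a[i][j] + a[j][i] for j in range(dim)] for i in range(dim)]
    s2 = matmul(s, s)
    return [0.5 * s2[n][n] for n in range(dim)]


x2 = x_squared_expectations(8)
for n in range(6):
    print(n, x2[n], n + 0.5)   # computed value next to N + 1/2
```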


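Example: Consider the non-stationary state
\[ \psi(x,0) = \frac{1}{\sqrt{3}}\,u_0(x) + \sqrt{\frac{2}{3}}\,u_1(x) \tag{3.280} \]

We use $Hu_N = E_Nu_N$, $E_N = \hbar\omega(N + 1/2)$, $\langle u_{N'} \mid u_N \rangle = \delta_{N'N}$. These two wave functions are shown in Figure 3.6 below.
\[ u_0(x) \propto e^{-m\omega x^2/2\hbar}, \qquad u_1(x) \propto x\,e^{-m\omega x^2/2\hbar} \tag{3.281} \]

Figure 3.6: Two wave functions making up the state

Therefore,
\[ \psi(x,t) = \frac{1}{\sqrt{3}}\,u_0(x)\,e^{-i\omega t/2} + \sqrt{\frac{2}{3}}\,u_1(x)\,e^{-3i\omega t/2} \tag{3.282} \]
so that
\[ \langle \psi(t) \mid \psi(t) \rangle = \frac{1}{3} + \frac{2}{3} = 1 \]
\[ p(E_0 = \hbar\omega/2, t) = \frac{1}{3}, \quad p(E_1 = 3\hbar\omega/2, t) = \frac{2}{3}, \quad p(E_{N>1}, t) = 0 \]
\[ \langle H \rangle = \langle \psi(t) \mid H\psi(t) \rangle = \sum_N p(E_N,t)\,E_N = \frac{1}{3}\left(\frac{\hbar\omega}{2}\right) + \frac{2}{3}\left(\frac{3\hbar\omega}{2}\right) = \frac{7\hbar\omega}{6} \]
and
\[ \langle x \rangle = \left\langle \frac{1}{\sqrt{3}}u_0(x)e^{-i\omega t/2} + \sqrt{\frac{2}{3}}u_1(x)e^{-3i\omega t/2} \;\Big|\; x\left(\frac{1}{\sqrt{3}}u_0(x)e^{-i\omega t/2} + \sqrt{\frac{2}{3}}u_1(x)e^{-3i\omega t/2}\right) \right\rangle \]
\[ = \frac{1}{3}\langle u_0 \mid x\,u_0 \rangle + \frac{2}{3}\langle u_1 \mid x\,u_1 \rangle + \frac{\sqrt{2}}{3}e^{i\omega t}\langle u_1 \mid x\,u_0 \rangle + \frac{\sqrt{2}}{3}e^{-i\omega t}\langle u_0 \mid x\,u_1 \rangle \]

We must calculate $\langle u_{N'} \mid x\,u_N \rangle$.
\[ \langle u_{N'} \mid x\,u_N \rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\langle u_{N'} \mid (a + a^+)u_N \rangle = \sqrt{\frac{\hbar}{2m\omega}}\left(\langle u_{N'} \mid a\,u_N \rangle + \langle u_{N'} \mid a^+u_N \rangle\right) \]
\[ = \sqrt{\frac{\hbar}{2m\omega}}\left(\sqrt{N}\,\delta_{N',N-1} + \sqrt{N+1}\,\delta_{N',N+1}\right) \tag{3.283} \]

This gives
\[ \langle x \rangle = 0 + 0 + \sqrt{\frac{\hbar}{2m\omega}}\,\frac{\sqrt{2}}{3}\left(e^{i\omega t} + e^{-i\omega t}\right) = \frac{2}{3}\sqrt{\frac{\hbar}{m\omega}}\,\cos\omega t \tag{3.284} \]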
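The oscillating mean position can be reproduced directly from the matrix elements (3.283) and the time-dependent coefficients. The following is a minimal sketch, assuming units with $\hbar = m = \omega = 1$ (so that $\langle x \rangle = (2/3)\cos t$); the function names `x_element` and `mean_x` are hypothetical and introduced only for this illustration.

```python
import cmath
import math


def x_element(n_prime, n):
    """<u_N' | x u_N> from (3.283), in units hbar = m = omega = 1."""
    if n_prime == n - 1:
        return math.sqrt(n / 2)
    if n_prime == n + 1:
        return math.sqrt((n + 1) / 2)
    return 0.0


def mean_x(t, coeffs=(1 / math.sqrt(3), math.sqrt(2 / 3))):
    """<x>(t) for psi = sum_N c_N u_N exp(-i E_N t), with E_N = N + 1/2."""
    ct = [c * cmath.exp(-1j * (n + 0.5) * t) for n, c in enumerate(coeffs)]
    total = sum(ct[a].conjugate() * x_element(a, b) * ct[b]
                for a in range(len(ct)) for b in range(len(ct)))
    return total.real


for t in (0.0, 0.7, math.pi):
    print(t, mean_x(t), (2 / 3) * math.cos(t))   # the two columns should agree
```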

We have the probability distribution as shown in Figure 3.7 below.

Figure 3.7: Probability distribution, $p(x,0) \propto |\psi(x,0)|^2$

Now an energy measurement is made at $t = 0$ and the value $\hbar\omega/2$ is obtained. The wave function collapses to
\[ \psi'(x,0) = \frac{1}{\sqrt{3}}\,u_0(x) \tag{3.285} \]

and we now have the probability distribution as shown in Figure 3.8 below. 



Figure 3.8: Probability distribution 


Immediately after the energy measurement, a position measurement indicates that the particle is not on the negative portion of the $x$-axis at $t = 0$ (for example, one shines light on the region $x < 0$ at $t = 0$ and does not see the particle). The wave function then collapses to $\psi''(x,0)$. To calculate this new wave function, one must expand $\psi'(x,0)$ in terms of position eigenfunctions.

We have $x\,v_X(x) = X\,v_X(x)$, where $X$ = constant eigenvalue of position, or
\[ (x - X)\,v_X(x) = 0 \tag{3.286} \]
This implies that $v_X(x) = \delta(x - X)$ is the position eigenfunction, where
\[ \langle v_{X'} \mid v_X \rangle = \int_{-\infty}^{\infty} dx\,\delta(x - X')\,\delta(x - X) = \delta(X' - X) \tag{3.287} \]
and
\[ \psi'(x,0) = \int_{-\infty}^{\infty} dX\,C_X\,v_X(x) \tag{3.288} \]
which is an expansion in terms of the CON set $\{v_X(x)\}$ with
\[ C_X = \langle v_X \mid \psi'(t = 0) \rangle = \int_{-\infty}^{\infty} dx\,\delta(x - X)\,\psi'(x,0) = \psi'(X,0) \tag{3.289} \]
Because the position measurement indicates that the particle is not in the $x < 0$ region, the wave function collapses to
\[ \psi''(x,0) = \int_{0}^{\infty} dX\,C_X\,v_X(x) \tag{3.290} \]
which is the part of the wave function with position eigenfunctions for $X > 0$. Therefore,
\[ \psi''(x,0) = \int_{0}^{\infty} dX\,\psi'(X,0)\,\delta(x - X) = \begin{cases} 0 & \text{for } x < 0 \\ \psi'(x,0) & \text{for } x > 0 \end{cases} \tag{3.291} \]
where
\[ \psi'(x,0) = \frac{1}{\sqrt{3}}\,u_0(x) \tag{3.292} \]

We get the probability distribution shown in Figure 3.9 below.

Figure 3.9: After the position measurement


A position measurement just collapses the wave function to zero in those regions 
where the particle is known not to be! 

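Note that
\[ \langle \psi''(t = 0) \mid \psi''(t = 0) \rangle = \int_{-\infty}^{\infty} dx\,\psi''^*(x,0)\,\psi''(x,0) = \int_{0}^{\infty} dx\,\psi'^*(x,0)\,\psi'(x,0) = \frac{1}{3}\int_{0}^{\infty} dx\,u_0^*(x)\,u_0(x) \]
\[ = \frac{1}{3}\left(\frac{1}{2}\int_{-\infty}^{\infty} dx\,u_0^*(x)\,u_0(x)\right) = \frac{1}{6}\,\langle u_0 \mid u_0 \rangle = \frac{1}{6} \]

$\psi''(x,0)$ is no longer an eigenfunction of energy. The probability of now measuring the energy to be $E_0 = \hbar\omega/2$ is given by
\[ p''(E_0 = \hbar\omega/2, 0) = \frac{\langle \psi''(t = 0) \mid u_0 \rangle \langle u_0 \mid \psi''(t = 0) \rangle}{\langle \psi''(t = 0) \mid \psi''(t = 0) \rangle} = \frac{1}{1/6}\left|\int_{-\infty}^{\infty} dx\,u_0^*(x)\,\psi''(x,0)\right|^2 \]
\[ = 6\left|\int_{0}^{\infty} dx\,u_0^*(x)\,\psi'(x,0)\right|^2 = 6\left|\int_{0}^{\infty} dx\,u_0^*(x)\,\frac{1}{\sqrt{3}}\,u_0(x)\right|^2 = 2\left|\frac{1}{2}\int_{-\infty}^{\infty} dx\,u_0^*(x)\,u_0(x)\right|^2 = 2\cdot\frac{1}{4}\,(1)^2 = \frac{1}{2} \]

Suppose that we make no measurements after the wave function has collapsed to $\psi''(x,0)$. The wave function will then develop in time under the action of the harmonic oscillator force. Let us find $\psi''(x,\tau/2)$ where $\tau = 2\pi/\omega$ is the classical period of the oscillator. We have
\[ \psi''(x,0) = \sum_{N=0}^{\infty} C_N\,u_N(x) \tag{3.293} \]
\[ \Rightarrow\; \psi''(x,t) = \sum_{N=0}^{\infty} C_N\,u_N(x)\,e^{-iE_Nt/\hbar} = e^{-i\omega t/2}\sum_{N=0}^{\infty} C_N\,u_N(x)\,e^{-iN\omega t} \tag{3.294} \]
and
\[ C_N = \langle u_N \mid \psi''(t = 0) \rangle = \int_{-\infty}^{\infty} dx\,u_N^*(x)\,\psi''(x,0) = \int_{0}^{\infty} dx\,u_N^*(x)\,\psi'(x,0) = \frac{1}{\sqrt{3}}\int_{0}^{\infty} dx\,u_N^*(x)\,u_0(x) \tag{3.295} \]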
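This is very complicated in general (we cannot use orthogonality since that requires the integration range $[-\infty,+\infty]$). We can do the following however.

For $N$ even we can write
\[ C_N = \frac{1}{\sqrt{3}}\int_{0}^{\infty} dx\,u_N^*(x)\,u_0(x) = \frac{1}{2\sqrt{3}}\int_{-\infty}^{\infty} dx\,u_N^*(x)\,u_0(x) = \frac{1}{2\sqrt{3}}\,\delta_{N0} \tag{3.296} \]
so that $C_0 = 1/2\sqrt{3}$ and for all other even $N$ the $C_N$ are zero. We will not need an explicit formula for odd $N$ values of $C_N$.

We then have
\[ \psi''(x,t) = e^{-i\omega t/2}\sum_{N=0}^{\infty} C_N\,u_N(x)\,e^{-iN\omega t} = e^{-i\omega t/2}\left[C_0u_0 + \sum_{N=1,3,5,\dots} C_N\,u_N(x)\,e^{-iN\omega t}\right] \tag{3.297} \]
\[ \psi''(x,\tau/2) = e^{-i\pi/2}\left[\frac{1}{2\sqrt{3}}u_0 + \sum_{N=1,3,5,\dots} C_N\,u_N(x)\underbrace{e^{-iN\pi}}_{=-1}\right] = -i\left[\frac{1}{\sqrt{3}}u_0 - \left(\frac{1}{2\sqrt{3}}u_0 + \sum_{N=1,3,5,\dots} C_N\,u_N(x)\right)\right] \]
\[ = -i\left[\frac{1}{\sqrt{3}}u_0 - \underbrace{\left(C_0u_0 + \sum_{N=1,3,5,\dots} C_N\,u_N(x)\right)}_{\psi''(x,0)}\right] = -i\left[\frac{1}{\sqrt{3}}u_0 - \psi''(x,0)\right] \tag{3.298} \]

Therefore,
\[ \psi''(x,\tau/2) = -i\left[\frac{1}{\sqrt{3}}u_0 - \psi''(x,0)\right] = \begin{cases} -\dfrac{i}{\sqrt{3}}\,u_0 & \text{for } x < 0 \\ 0 & \text{for } x > 0 \end{cases} \tag{3.299} \]

Therefore at $\tau/2$, the probability distribution has oscillated to the $x < 0$ region as shown in Figure 3.10 below, as we might expect for an oscillator!

Figure 3.10: After the position measurement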
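The statement that $C_0 = 1/2\sqrt{3}$ while all other even-$N$ coefficients vanish can be checked numerically. The sketch below is an illustration under stated assumptions: units with $\hbar = m = \omega = 1$, real normalized eigenfunctions generated by the standard three-term recurrence, and Simpson's rule on the half line truncated at $y = 10$. The names `oscillator_states` and `half_line_coefficients` are hypothetical helpers, not part of the text.

```python
import math


def oscillator_states(n_max, y):
    """Values u_0(y)..u_{n_max}(y) of the normalized oscillator
    eigenfunctions (hbar = m = omega = 1), built by upward recurrence."""
    vals = [math.pi ** -0.25 * math.exp(-0.5 * y * y)]
    if n_max >= 1:
        vals.append(math.sqrt(2.0) * y * vals[0])
    for n in range(1, n_max):
        vals.append(math.sqrt(2.0 / (n + 1)) * y * vals[n]
                    - math.sqrt(n / (n + 1)) * vals[n - 1])
    return vals


def half_line_coefficients(n_max, y_max=10.0, steps=2000):
    """C_N = (1/sqrt(3)) * integral_0^inf u_N u_0 dy via Simpson's rule
    (steps must be even; the tail beyond y_max is negligible)."""
    h = y_max / steps
    totals = [0.0] * (n_max + 1)
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        vals = oscillator_states(n_max, k * h)
        for n in range(n_max + 1):
            totals[n] += weight * vals[n] * vals[0]
    return [t * h / 3 / math.sqrt(3) for t in totals]


C = half_line_coefficients(6)
for n, c in enumerate(C):
    print(n, round(c, 8))
```

In this sketch the even-$N$ entries beyond $N = 0$ come out at the level of the integration error, while the odd-$N$ entries are non-zero; those odd terms are what carry the sign flip $e^{-iN\pi} = -1$ at $t = \tau/2$.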

We now discuss general potential functions. 

3.7. General Potential Functions 

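Consider a particle with potential energy $V(\vec{x})$ so that
\[ H = \frac{\vec{p}\cdot\vec{p}}{2m} + V(\vec{x}) = -\frac{\hbar^2}{2m}\nabla^2 + V(\vec{x}) \tag{3.300} \]

The particle is completely described by a wave function $\psi(\vec{x},t)$, where $\langle \psi(t) \mid \psi(t) \rangle$ is finite and non-zero and where the time-development of the wave function between measurements is determined by the time-dependent Schrodinger equation
\[ H\psi(\vec{x},t) = i\hbar\,\frac{\partial\psi(\vec{x},t)}{\partial t} \tag{3.301} \]

To study the behavior of such a particle, one begins by finding the energy eigenfunctions and eigenvalues:

$\{u_n^{(\alpha)}(\vec{x}), u_{c\nu}^{(\beta)}(\vec{x})\}$ - CON set of eigenfunctions of $H$

$Hu_n^{(\alpha)}(\vec{x}) = E_n\,u_n^{(\alpha)}(\vec{x})$ - discrete part of the spectrum

$Hu_{c\nu}^{(\beta)}(\vec{x}) = E_{c\nu}\,u_{c\nu}^{(\beta)}(\vec{x})$ - continuum part of the spectrum

Knowledge of $\{u_n^{(\alpha)}(\vec{x}), u_{c\nu}^{(\beta)}(\vec{x}), E_n, E_{c\nu}\}$ not only gives the possible results of an energy measurement but also gives a simple expression for the time-dependence of any wave function for the particle:
\[ \psi(\vec{x},t) = \sum_{n,\alpha} c_n^{(\alpha)}\,u_n^{(\alpha)}(\vec{x})\,e^{-iE_nt/\hbar} + \int d\nu \sum_{\beta} d_\nu^{(\beta)}\,u_{c\nu}^{(\beta)}(\vec{x})\,e^{-iE_{c\nu}t/\hbar} \tag{3.302} \]
where
\[ c_n^{(\alpha)} = \langle u_n^{(\alpha)} \mid \psi(t = 0) \rangle \quad\text{and}\quad d_\nu^{(\beta)} = \langle u_{c\nu}^{(\beta)} \mid \psi(t = 0) \rangle \tag{3.303} \]
This explicitly gives $\psi(\vec{x},t)$ in terms of $\psi(\vec{x},0)$.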


Let us first consider a particle constrained to move along the $x$-axis (one-dimensional problem). We have already discussed the one-dimensional harmonic oscillator. Now, we want to discuss a particle whose potential energy is given by:
\[ V(x) = \begin{cases} V_L\ (\text{constant}) & \text{for } x < x_L \\ V_R\ (\text{constant}) & \text{for } x > x_R \\ V(x)\ (\text{arbitrary}) & \text{for } x_L < x < x_R \end{cases} \tag{3.304} \]
as shown in Figure 3.11 below.

We have the energy eigenfunction/eigenvalue equation:
\[ H\phi = -\frac{\hbar^2}{2m}\frac{d^2\phi}{dx^2} + V(x)\phi = E\phi \tag{3.305} \]
\[ \Rightarrow\; \frac{d^2\phi(x)}{dx^2} = \frac{2m}{\hbar^2}\,(V(x) - E)\,\phi(x) \tag{3.306} \]

3.7.1. Method for Solving this Eigenvalue Equation 

1. Solve the differential equation with $E$ any arbitrary constant.

2. Determine the allowed values of $E$ by requiring that the solution $\phi$ does not become infinite as $|x| \to \infty$. This is a necessary condition for $\phi$ to be an acceptable energy eigenfunction - the inner product of $\phi$ with almost every square-integrable function must be finite. For some values of $E$, the solution to the differential equation will become infinite as $|x| \to \infty$. Such a solution is not an acceptable energy eigenfunction and, therefore, the corresponding value of $E$ is not an allowed eigenvalue.

Notes: 

(a) The given differential equation implies that $\phi$ and $d\phi/dx$ must be continuous in regions where $V(x)$ is finite (however, $V(x)$ need not be continuous in such regions). These are necessary conditions for $d^2\phi(x)/dx^2$ to be finite, where the given differential equation implies that $d^2\phi(x)/dx^2$ must be finite for finite $V(x)$ and finite $E$.

More General Discussion (from Chapter 2) 

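Since $\phi(x)$ is physically related to a probability amplitude and hence to a measurable probability, we assume that $\phi(x)$ is continuous.

Using this fact, we can determine the general continuity properties of $d\phi(x)/dx$. The continuity property at a particular point, say $x = x_0$, is derived as follows:
\[ \int_{x_0-\varepsilon}^{x_0+\varepsilon} \frac{d^2\phi(x)}{dx^2}\,dx = -\frac{2m}{\hbar^2}\left[E\int_{x_0-\varepsilon}^{x_0+\varepsilon} \phi(x)\,dx - \int_{x_0-\varepsilon}^{x_0+\varepsilon} V(x)\,\phi(x)\,dx\right] \]

Taking the limit as $\varepsilon \to 0$
\[ \lim_{\varepsilon\to 0}\left[\left.\frac{d\phi(x)}{dx}\right|_{x=x_0+\varepsilon} - \left.\frac{d\phi(x)}{dx}\right|_{x=x_0-\varepsilon}\right] = -\frac{2m}{\hbar^2}\left[E\lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} \phi(x)\,dx - \lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} V(x)\,\phi(x)\,dx\right] \]
or
\[ \Delta\left(\frac{d\phi(x)}{dx}\right) = \frac{2m}{\hbar^2}\lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} V(x)\,\phi(x)\,dx \tag{3.307} \]
where we have used the continuity of $\phi(x)$ to set
\[ \lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} \phi(x)\,dx = 0 \]
This makes it clear that whether or not $d\phi(x)/dx$ has a discontinuity depends directly on the potential energy function.

If $V(x)$ is continuous at $x = x_0$ (harmonic oscillator example), i.e.,
\[ \lim_{\varepsilon\to 0}\left[V(x_0 + \varepsilon) - V(x_0 - \varepsilon)\right] = 0 \tag{3.308} \]
then
\[ \Delta\left(\frac{d\phi(x)}{dx}\right) = \frac{2m}{\hbar^2}\lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} V(x)\,\phi(x)\,dx = 0 \tag{3.309} \]
and $d\phi(x)/dx$ is continuous.

If $V(x)$ has a finite discontinuity (jump) at $x = x_0$ (finite square well and square barrier examples later), i.e.,
\[ \lim_{\varepsilon\to 0}\left[V(x_0 + \varepsilon) - V(x_0 - \varepsilon)\right] = \text{finite} \tag{3.310} \]
then the integrand in (3.307) remains finite, the integral again vanishes as $\varepsilon \to 0$, and $d\phi(x)/dx$ is continuous.

Finally, if $V(x)$ has an infinite jump at $x = x_0$ (infinite square well and delta-function examples later), then we have two choices

(a) if the potential is infinite over an extended range of $x$ (the infinite well), then we must force $\phi(x) = 0$ in that region and use only the continuity of $\phi(x)$ as a boundary condition at the edge of the region

(b) if the potential is infinite at a single point, i.e., $V(x) = \delta(x - x_0)$, then
\[ \Delta\left(\frac{d\phi(x)}{dx}\right) = \frac{2m}{\hbar^2}\lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} V(x)\,\phi(x)\,dx = \frac{2m}{\hbar^2}\lim_{\varepsilon\to 0}\int_{x_0-\varepsilon}^{x_0+\varepsilon} \delta(x - x_0)\,\phi(x)\,dx \]
\[ = \frac{2m}{\hbar^2}\lim_{\varepsilon\to 0}\phi(x_0) = \frac{2m}{\hbar^2}\,\phi(x_0) \tag{3.311} \]
and, thus, $d\phi(x)/dx$ is discontinuous.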

We will use these rules later. 


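(b) Because $V(x)$ is a real function of $x$ and because the allowed values of $E$ must be real (they are eigenvalues of the hermitian operator $H$), $\mathrm{Re}\,\phi$ and $\mathrm{Im}\,\phi$ obey the same differential equation as does $\phi = \mathrm{Re}\,\phi + i\,\mathrm{Im}\,\phi$, i.e.,
\[ \frac{d^2(\mathrm{Re}\,\phi + i\,\mathrm{Im}\,\phi)}{dx^2} = \frac{2m}{\hbar^2}\,(V(x) - E)\,(\mathrm{Re}\,\phi + i\,\mathrm{Im}\,\phi) \]
implies that
\[ \frac{d^2}{dx^2}\begin{Bmatrix} \mathrm{Re}\,\phi \\ \mathrm{Im}\,\phi \end{Bmatrix} = \frac{2m}{\hbar^2}\,(V(x) - E)\begin{Bmatrix} \mathrm{Re}\,\phi \\ \mathrm{Im}\,\phi \end{Bmatrix} \]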

(c) In any region where $V(x) > E$, $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ and $d^2\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}/dx^2$ have the same sign. Thus, $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ must be concave upward if $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ is positive and concave downward if $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ is negative - $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ must curve away from the $x$-axis (non-oscillatory) as shown in Figure 3.12 below.

Figure 3.12: Behavior for $V(x) > E$

Note that $V(x) > E$ is the classically forbidden region where the kinetic energy would be negative.

In any region where $V(x) < E$, $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ and $d^2\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}/dx^2$ have opposite signs. Thus, $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ must be concave upward if $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ is negative and concave downward if $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ is positive - $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ must curve toward the $x$-axis (oscillatory) as shown in Figure 3.13 below.

Figure 3.13: Behavior for $V(x) < E$

Note that $V(x) < E$ is the classically allowed region where the kinetic energy would be positive.


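(d) If $V(x)$ is equal to some constant value $V_0$ in a region, then $\phi(x)$ can easily be found in that region:

$E > V_0$ (oscillatory)
\[ \frac{d^2\phi(x)}{dx^2} = \underbrace{\frac{2m}{\hbar^2}(V_0 - E)}_{-k^2\ (\text{negative})}\,\phi(x), \qquad k = +\sqrt{\frac{2m}{\hbar^2}(E - V_0)} \]
\[ \phi(x) = Ae^{ikx} + Be^{-ikx} = C\sin kx + D\cos kx \]

$E < V_0$ (non-oscillatory)
\[ \frac{d^2\phi(x)}{dx^2} = \underbrace{\frac{2m}{\hbar^2}(V_0 - E)}_{+k^2\ (\text{positive})}\,\phi(x), \qquad k = +\sqrt{\frac{2m}{\hbar^2}(V_0 - E)} \]
\[ \phi(x) = Ae^{kx} + Be^{-kx} = C\sinh kx + D\cosh kx \]

$E = V_0$
\[ \frac{d^2\phi(x)}{dx^2} = \underbrace{\frac{2m}{\hbar^2}(V_0 - E)}_{=0}\,\phi(x) = 0 \]
\[ \phi(x) = Ax + B \]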


Let us now consider some properties of the spectrum of $H$ when the potential energy $V(x)$ has the general form given earlier
\[ V(x) = \begin{cases} V_L\ (\text{constant}) & \text{for } x < x_L \\ V_R\ (\text{constant}) & \text{for } x > x_R \\ V(x)\ (\text{arbitrary}) & \text{for } x_L < x < x_R \end{cases} \]

For definiteness, let $V_L < V_R$. For any constant $E$, the differential equation for $\phi(x)$ has two linearly independent solutions (because the differential equation is second order). However, the extra condition that $\phi(x)$ not become infinite as $|x| \to \infty$ may limit the number of acceptable solutions for that $E$ - for a given $E$, there may exist two linearly independent acceptable solutions, only one linearly independent acceptable solution, or no non-zero acceptable solution.

For $E > V_R > V_L$ we have
\[ x < x_L \Rightarrow V(x) = V_L \Rightarrow \phi(x) = Ae^{ik_Lx} + Be^{-ik_Lx} \quad\text{where}\quad k_L = +\sqrt{\frac{2m}{\hbar^2}(E - V_L)} \tag{3.312} \]
\[ x > x_R \Rightarrow V(x) = V_R \Rightarrow \phi(x) = Ce^{ik_Rx} + De^{-ik_Rx} \quad\text{where}\quad k_R = +\sqrt{\frac{2m}{\hbar^2}(E - V_R)} \tag{3.313} \]

That the given differential equation has two linearly independent solutions means that $\phi(x)$, $x \in (-\infty,+\infty)$, has two arbitrary constants. Choose these constants to be $C$ and $D$. Then $A(C,D)$ and $B(C,D)$ will be functions of $C$ and $D$, which are found by solving the differential equation. The extra condition, that $\phi(x)$ does not become infinite as $|x| \to \infty$, puts no restrictions on the solutions we have written. Thus, $C$ and $D$ can still be arbitrary. Therefore, there exists an acceptable $\phi(x)$ for any $E > V_R > V_L$ and it depends on two arbitrary constants - there is 2-fold degeneracy for any $E > V_R > V_L$. The corresponding eigenfunctions are clearly non-normalizable (continuum eigenfunctions).

For $V_R > V_L > E$ we have
\[ x < x_L \Rightarrow V(x) = V_L \Rightarrow \phi(x) = Ae^{k_Lx} + Be^{-k_Lx} \quad\text{where}\quad k_L = +\sqrt{\frac{2m}{\hbar^2}(V_L - E)} \tag{3.314} \]
\[ x > x_R \Rightarrow V(x) = V_R \Rightarrow \phi(x) = Ce^{k_Rx} + De^{-k_Rx} \quad\text{where}\quad k_R = +\sqrt{\frac{2m}{\hbar^2}(V_R - E)} \tag{3.315} \]

Choose $C$ and $D$ as the two arbitrary constants which enter into the solution of the given second-order differential equation. $A(C,D)$ and $B(C,D)$ will be functions of $C$ and $D$. The extra condition, that $\phi(x)$ does not become infinite as $|x| \to \infty$, requires that $C = 0$ and $B = 0$. If a non-trivial, acceptable solution to the differential equation exists, then $D$ must be able to take on any arbitrary value. This follows from the fact that the given differential equation is homogeneous: if $\phi(x)$ is an acceptable solution, then any constant times $\phi(x)$ is also an acceptable solution. Thus, we need values of $E$ for which $C = 0$ and $B(C = 0, D) = 0$ with $D$ arbitrary can be satisfied. These would correspond to the allowed values of $E$, and the corresponding eigenfunctions would be non-degenerate (one arbitrary constant $D$) and normalizable since $|\phi(x)|$ decreases exponentially as $x \to \pm\infty$, which implies that $\langle \phi \mid \phi \rangle$ is finite. Whether or not these allowed values of $E$ exist depends on the $V(x)$ considered.

Note: $E$ is not allowed if $E < V(x)$ for all $x \in (-\infty,+\infty)$.

Proof: We have
\[ \frac{d^2}{dx^2}\begin{Bmatrix} \mathrm{Re}\,\phi \\ \mathrm{Im}\,\phi \end{Bmatrix} = \frac{2m}{\hbar^2}\underbrace{(V(x) - E)}_{\text{positive for all } x}\begin{Bmatrix} \mathrm{Re}\,\phi \\ \mathrm{Im}\,\phi \end{Bmatrix} \tag{3.316} \]

Therefore, $d^2\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}/dx^2$ and $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ have the same sign for all $x$, which implies that $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ always curves away from the $x$-axis. Since $E < V_L$ and $E < V_R$, the acceptable solutions have
\[ \begin{Bmatrix} \mathrm{Re}\,\phi \\ \mathrm{Im}\,\phi \end{Bmatrix} \propto e^{k_Lx} \quad\text{for } x < x_L \tag{3.317} \]
and
\[ \begin{Bmatrix} \mathrm{Re}\,\phi \\ \mathrm{Im}\,\phi \end{Bmatrix} \propto e^{-k_Rx} \quad\text{for } x > x_R \tag{3.318} \]

As can be seen from Figure 3.14 below, it is clearly impossible to join the 2 parts of $\{\mathrm{Re}\,\phi, \mathrm{Im}\,\phi\}$ at $x_R$ such that the function is continuous and has a continuous first derivative. Thus, such values of $E$ are unacceptable. For an allowed value of $E$, it is therefore necessary to have some region of $x$ in which $E > V(x)$.

Figure 3.14: Impossible to join


Summary: (these degeneracy statements apply only to one-dimensional motion along a straight line)

$E > V_R > V_L$

All such $E$ values are allowed.

Each $E$ is 2-fold degenerate

$\phi(x)$ is oscillatory for $x < x_L$ and for $x > x_R$

$\phi(x)$ is non-normalizable (continuum)

$V_R > E > V_L$

All such $E$ values are allowed.

Each $E$ is non-degenerate

$\phi(x)$ is oscillatory for $x < x_L$ and exponentially decreasing for $x > x_R$

$\phi(x)$ is non-normalizable (continuum)

$E < V_L < V_R$

Only certain $E$ values are allowed (it is possible for no $E$ values to be allowed in this range)

Each $E$ is non-degenerate

$\phi(x)$ is exponentially decreasing for $x < x_L$ and for $x > x_R$

$\phi(x)$ is normalizable (discrete)

Eigenfunctions in this energy range are called bound states
Examples: See Figures 3.15 and 3.16 below.

Figure 3.15: Example #1 (allowed $E$ values: non-degenerate continuum and, above it, 2-fold degenerate continuum)

Figure 3.16: Example #2 (allowed $E$ values: 2-fold degenerate continuum; non-degenerate continuum; possible non-degenerate bound states, which may or may not be present)


Piecewise Constant Potential Energies 


In one dimension these are particularly simple to analyze. Consider the potential shown in Figure 3.17 below.

Figure 3.17: Piecewise Constant Potential Energy

Given the differential equation
\[ \frac{d^2\phi(x)}{dx^2} = \frac{2m}{\hbar^2}\,(V(x) - E)\,\phi(x) \tag{3.319} \]
we have the following method:

1. Solve the differential equation with $E$ any arbitrary constant. This is most easily accomplished as follows:

(a) Solve the differential equation in each region for which $V(x)$ is a constant (discussed earlier). Special case: $\phi = 0$ in a region where $V(x) = \infty$.

(b) Match the solutions across each boundary: $\phi$ and $d\phi/dx$ are continuous. Special case: only $\phi$ is continuous across a boundary with an infinite jump in $V(x)$.

2. Determine the allowed values of $E$ by requiring that the solution $\phi$ does not become infinite as $|x| \to \infty$.

The following example will illustrate the method. We consider the potential 
function shown in Figure 3.18 below. 


Figure 3.18: Step Function Potential - Infinite Barrier

Our general results completely determine the spectrum (see earlier discussion). However, let us calculate it explicitly.

We have $F_x = -dV/dx = 0$ except at $x = 0$. Classically, $E < 0$ is not allowed. $0 < E < V_0$ has unbound motion confined to $x < 0$. $E > V_0$ has unbound motion over the entire $x$-axis.

$E < 0$
\[ x < 0 \Rightarrow \phi = \phi_2 = Ae^{k_2x} + Be^{-k_2x} \tag{3.320} \]
\[ x > 0 \Rightarrow \phi = \phi_1 = Ce^{k_1x} + De^{-k_1x} \tag{3.321} \]
where
\[ k_2 = \sqrt{-\frac{2mE}{\hbar^2}} \quad\text{and}\quad k_1 = \sqrt{\frac{2m(V_0 - E)}{\hbar^2}} \tag{3.322} \]
$\phi$ must not become infinite as $|x| \to \infty$, which implies that $B = C = 0$ so that
\[ \phi_2 = Ae^{k_2x} \quad\text{and}\quad \phi_1 = De^{-k_1x} \tag{3.323} \]
Then
\[ \phi_1(0) = \phi_2(0) \Rightarrow D = A \]
\[ \frac{d\phi_1(0)}{dx} = \frac{d\phi_2(0)}{dx} \Rightarrow -k_1D = k_2A \Rightarrow -k_1 = k_2 \]
which is impossible. Therefore, there are no allowed values for $E < 0$.


$0 < E < V_0$
\[ x < 0 \Rightarrow \phi = \phi_2 = Ae^{ik_2x} + Be^{-ik_2x} \tag{3.324} \]
\[ x > 0 \Rightarrow \phi = \phi_1 = Ce^{k_1x} + De^{-k_1x} \tag{3.325} \]
where
\[ k_2 = \sqrt{\frac{2mE}{\hbar^2}} \quad\text{and}\quad k_1 = \sqrt{\frac{2m(V_0 - E)}{\hbar^2}} \tag{3.326} \]
$\phi$ must not become infinite as $|x| \to \infty$, which implies that $C = 0$ so that
\[ \phi_2 = Ae^{ik_2x} + Be^{-ik_2x} \quad\text{and}\quad \phi_1 = De^{-k_1x} \tag{3.327} \]
Then
\[ \phi_1(0) = \phi_2(0) \Rightarrow D = A + B \]
\[ \frac{d\phi_1(0)}{dx} = \frac{d\phi_2(0)}{dx} \Rightarrow -k_1D = ik_2A - ik_2B \]

These two equations determine $A$ and $B$ in terms of $D$, which is arbitrary. This is possible for any $E$. Therefore, all $E$ values for $0 < E < V_0$ are allowed and they are non-degenerate (one arbitrary constant $D$).
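As an illustration of the matching step for $0 < E < V_0$, the two conditions $D = A + B$ and $-k_1D = ik_2(A - B)$ can be solved in closed form, giving $A = \frac{D}{2}(1 + ik_1/k_2)$ and $B = \frac{D}{2}(1 - ik_1/k_2)$. The sketch below evaluates these; it assumes units with $\hbar = m = 1$ by default, and the function name `step_coefficients` and its parameters are hypothetical, introduced only for this example.

```python
import math


def step_coefficients(E, V0, D=1.0, m=1.0, hbar=1.0):
    """Solve D = A + B and -k1*D = i*k2*(A - B) for A and B,
    valid for 0 < E < V0 (step of height V0 at x = 0)."""
    k2 = math.sqrt(2 * m * E) / hbar            # wave number for x < 0
    k1 = math.sqrt(2 * m * (V0 - E)) / hbar     # decay constant for x > 0
    A = 0.5 * D * (1 + 1j * k1 / k2)
    B = 0.5 * D * (1 - 1j * k1 / k2)
    return A, B, k1, k2


A, B, k1, k2 = step_coefficients(E=0.3, V0=1.0)
print("A =", A, " B =", B)
print("|A| =", abs(A), " |B| =", abs(B))
```

In this solution $|A| = |B|$ for every $E$ in the range, so the two oppositely directed plane waves on the $x < 0$ side have equal magnitude, and a solution exists for any choice of $D$, consistent with the non-degenerate continuum found above.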


$E > V_0$
\[ x < 0 \Rightarrow \phi = \phi_2 = Ae^{ik_2x} + Be^{-ik_2x} \tag{3.328} \]
\[ x > 0 \Rightarrow \phi = \phi_1 = Ce^{ik_1x} + De^{-ik_1x} \tag{3.329} \]
where
\[ k_2 = \sqrt{\frac{2mE}{\hbar^2}} \quad\text{and}\quad k_1 = \sqrt{\frac{2m(E - V_0)}{\hbar^2}} \tag{3.330} \]
The condition that $\phi$ must not become infinite as $|x| \to \infty$ places no restrictions on $A$, $B$, $C$, $D$. Then
\[ \phi_1(0) = \phi_2(0) \Rightarrow C + D = A + B \]
\[ \frac{d\phi_1(0)}{dx} = \frac{d\phi_2(0)}{dx} \Rightarrow ik_1(C - D) = ik_2(A - B) \]

These two equations determine $A$ and $B$ in terms of $C$ and $D$, which are both arbitrary. This is possible for any $E$. Therefore, all $E$ values for $E > V_0$ are allowed and there is 2-fold degeneracy (two arbitrary constants $C$ and $D$).


These results agree with our analysis of the general V(x). 


Notice that the potential energy just considered has some physical significance:

Figure 3.19: End of a wire

(a) Consider an electron constrained to move along the $x$-axis. See Figure 3.19 above.

The electron has no force acting on it when it is in the vacuum region. Furthermore, the electron is essentially free in its motion through the metal. However, an energy $V_0$ (the metal's work function) is needed for the electron to leave the metal's surface and enter the vacuum region. Thus,
\[ V(x) = \begin{cases} 0 & \text{for } x < 0 \\ V_0 & \text{for } x > 0 \end{cases} \tag{3.331} \]
as in the example above.

(b) Consider an electron constrained to move along the $x$-axis. The electron moves within 2 conducting cylindrical tubes which are separated by a small gap and which are maintained at different potentials (the $x > 0$ tube is at electrical potential $-\Phi$ with respect to the $x < 0$ tube). See Figure 3.20 below.

Figure 3.20: Gap between tubes

The potential energy is
\[ V(x) = \begin{cases} 0 & \text{for } x < 0 \\ V_0 & \text{for } x > 0 \end{cases} \tag{3.332} \]
where $V_0 = -e\Phi$ and $e$ = the electron's charge. Again, this is the same as in the example above.


3.7.2. Symmetrical Potential Well (finite depth) 

We consider the potential energy function shown in Figure 3.21 below:

Figure 3.21: Finite square well

so that
\[ V(x) = \begin{cases} V_0 & \text{for } x > a/2 \\ 0 & \text{for } -a/2 < x < a/2 \\ V_0 & \text{for } x < -a/2 \end{cases} \tag{3.333} \]

The energy spectrum has a 2-fold degenerate continuum for all $E > V_0$. There are no energy eigenvalues for $E < 0$. The question is whether or not there are bound states in the region $0 < E < V_0$.

Note: We have $V(x) = V(-x)$. Recall that the parity operator $\Pi$ is defined by $\Pi f(x) = f(-x)$. The eigenvalues of $\Pi$ are $\pm 1$ and the corresponding eigenfunctions are (even, odd) functions of $x$.

Notice also that $[\Pi, H] = 0$.

Proof: We have
\[ \Pi Hf(x) = \Pi\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right)f(x) = -\frac{\hbar^2}{2m}f''(-x) + V(-x)f(-x) = \left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right)\Pi f(x) \]
so that $\Pi Hf(x) = H\Pi f(x)$ for any $f(x)$ or $[\Pi, H] = 0$.



Thus, we can find a CON set of simultaneous eigenfunctions of $\Pi$ and $H$ - each eigenfunction in this set will be either an even function of $x$ or an odd function of $x$. This observation will simplify our calculation - we need to match $\phi$ and $d\phi/dx$ at $x = a/2$ only. An even or odd function of $x$ will then automatically have $\phi$ and $d\phi/dx$ continuous at $x = -a/2$.

$0 < E < V_0$: we define
\[ k_2 = \sqrt{\frac{2mE}{\hbar^2}} \quad\text{and}\quad k_1 = \sqrt{\frac{2m(V_0 - E)}{\hbar^2}} \tag{3.334} \]

$\phi_{even}$: we have
\[ x > a/2 \Rightarrow \phi = \phi_1 = \underbrace{A}_{\substack{A = 0 \text{ for } \phi \text{ not infinite} \\ \text{as } |x| \to \infty}} e^{+k_1x} + Be^{-k_1x} \tag{3.335} \]
\[ -a/2 < x < a/2 \Rightarrow \phi = \phi_2 = \underbrace{C}_{\substack{C = 0 \text{ for an} \\ \text{even function}}}\sin k_2x + D\cos k_2x \tag{3.336} \]
\[ \phi_1(a/2) = \phi_2(a/2) \Rightarrow Be^{-k_1a/2} = D\cos\frac{k_2a}{2} \tag{3.337} \]
\[ \frac{d\phi_1(a/2)}{dx} = \frac{d\phi_2(a/2)}{dx} \Rightarrow -k_1Be^{-k_1a/2} = -k_2D\sin\frac{k_2a}{2} \tag{3.338} \]

Therefore, $B, D \neq 0$ implies that we can divide (3.338) by (3.337) and get
\[ k_1 = k_2\tan\frac{k_2a}{2} \tag{3.339} \]
$E$ must obey the above transcendental equation for $\phi_{even}$ to be an eigenfunction.


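$\phi_{odd}$: we have
\[ x > a/2 \Rightarrow \phi = \phi_1 = \underbrace{A}_{\substack{A = 0 \text{ for } \phi \text{ not infinite} \\ \text{as } |x| \to \infty}} e^{+k_1x} + Be^{-k_1x} \tag{3.340} \]
\[ -a/2 < x < a/2 \Rightarrow \phi = \phi_2 = C\sin k_2x + \underbrace{D}_{\substack{D = 0 \text{ for an} \\ \text{odd function}}}\cos k_2x \tag{3.341} \]
\[ \phi_1(a/2) = \phi_2(a/2) \Rightarrow Be^{-k_1a/2} = C\sin\frac{k_2a}{2} \tag{3.342} \]
\[ \frac{d\phi_1(a/2)}{dx} = \frac{d\phi_2(a/2)}{dx} \Rightarrow -k_1Be^{-k_1a/2} = k_2C\cos\frac{k_2a}{2} \tag{3.343} \]

Therefore, $B, C \neq 0$ implies that we can divide (3.343) by (3.342) and get
\[ -k_1 = k_2\cot\frac{k_2a}{2} \tag{3.344} \]
$E$ must obey the above transcendental equation for $\phi_{odd}$ to be an eigenfunction.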


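Therefore we have
\[ \phi_{even}: \quad k_1\frac{a}{2} = k_2\frac{a}{2}\tan\frac{k_2a}{2} \tag{3.345} \]
\[ \phi_{odd}: \quad -k_1\frac{a}{2} = k_2\frac{a}{2}\cot\frac{k_2a}{2} \tag{3.346} \]
We let
\[ \eta = k_2\frac{a}{2} = \sqrt{\frac{2mE}{\hbar^2}}\,\frac{a}{2} \tag{3.347} \]
and then
\[ k_1\frac{a}{2} = \frac{a}{2}\sqrt{\frac{2m(V_0 - E)}{\hbar^2}} = \sqrt{\Gamma - \eta^2} \tag{3.348} \]
where
\[ \Gamma = \frac{mV_0a^2}{2\hbar^2} \tag{3.349} \]
Thus,
\[ \phi_{even}: \quad \sqrt{\Gamma - \eta^2} = \eta\tan\eta \tag{3.350} \]
\[ \phi_{odd}: \quad \sqrt{\Gamma - \eta^2} = -\eta\cot\eta \tag{3.351} \]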


These equations determine the allowed values of $E$ for the two types of solutions. We can solve these transcendental equations graphically. We plot $y = \eta\tan\eta$ and $y = -\eta\cot\eta$. The intersections of these curves with the curve $y = \sqrt{\Gamma - \eta^2}$ yield the allowed values of
\[ \eta = \sqrt{\frac{2mE}{\hbar^2}}\,\frac{a}{2} \tag{3.352} \]
and thus the allowed values of $E$ for the even and odd solutions. Now $E < V_0 \Rightarrow \eta^2 < \Gamma$ so that $y = +\sqrt{\Gamma - \eta^2}$ is one quadrant of a circle of radius $\sqrt{\Gamma}$. Note also that $\eta$ is always positive. The graphical solution is shown in Figure 3.22 below.

Figure 3.22: Graphical Solution

Comments:

1. There always exists at least one bound state for the symmetric potential well, regardless of the value of $\Gamma$. Asymmetric wells need not have any bound states.

2. The bound state energies are indeed discrete (separated).

3. $\Gamma$ finite implies that there are a finite number of bound states.

4. Let $V_0 \to \infty$ ($a$ fixed). Therefore, $\Gamma \to \infty$ and the circle $y = +\sqrt{\Gamma - \eta^2}$ has an infinite radius. In this case, the circle intersects $y = \eta\tan\eta$ and $y = -\eta\cot\eta$ at the asymptotes (dotted vertical lines in Figure 3.22). Thus, the allowed values of $\eta$ are
\[ \eta = \frac{\pi}{2},\ \pi,\ \frac{3\pi}{2},\ 2\pi,\ \frac{5\pi}{2},\ 3\pi,\ \frac{7\pi}{2},\ \dots \]
(infinite number of bound states). Therefore,
\[ \eta = \sqrt{\frac{2mE}{\hbar^2}}\,\frac{a}{2} = n\frac{\pi}{2} \;\Rightarrow\; E = \frac{n^2\pi^2\hbar^2}{2ma^2} \tag{3.353} \]
which agrees with our earlier result for the potential well with impenetrable walls (infinite square well).

5. Let $N$ be the number of bound states present. We then have
\[ (N - 1)\frac{\pi}{2} < \sqrt{\Gamma} < N\frac{\pi}{2} \tag{3.354} \]
as the condition for exactly $N$ bound states to exist. Note that the number of bound states present depends on the combination $V_0a^2$ - a wide and shallow well can have as many bound states as a narrow and deep well.

6. The ground state is an even function of $x$. The first excited state (if it exists) is an odd function of $x$. The second excited state (if it exists) is an even function of $x$. Obviously, this alternation between even and odd functions of $x$ continues to hold for all bound states.


7. The ground state is always present for $0 < E < V_0$. We have, for the ground state
\[ k_2\frac{a}{2} = \sqrt{\frac{2mE_0}{\hbar^2}}\,\frac{a}{2} = \eta_0 < \frac{\pi}{2} \;\Rightarrow\; E_0 < \frac{\pi^2\hbar^2}{2ma^2} \tag{3.355} \]
and
\[ x < -a/2 \Rightarrow \phi = \phi_3 \propto e^{+k_1x} \tag{3.356} \]
\[ -a/2 < x < a/2 \Rightarrow \phi = \phi_2 \propto \cos k_2x \tag{3.357} \]
\[ x > a/2 \Rightarrow \phi = \phi_1 \propto e^{-k_1x} \tag{3.358} \]
Therefore, $\phi$ has no zeroes. The wave function is shown in Figure 3.23:

Figure 3.23: Ground state wave function (non-zero probability to be in the classically forbidden region)
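The graphical solution of (3.350) and (3.351) can also be carried out numerically. The following is a minimal sketch, not a definitive implementation: it uses plain bisection on each interval $[n\pi/2, (n+1)\pi/2]$ that starts below $\sqrt{\Gamma}$, after multiplying each condition through by $\cos\eta$ or $\sin\eta$ to avoid the poles of $\tan$ and $\cot$. The names `_mismatch` and `bound_state_etas` are hypothetical helpers; energies follow from $E/V_0 = \eta^2/\Gamma$.

```python
import math


def _mismatch(eta, gamma, even):
    """Zero when eta solves sqrt(gamma - eta^2) = eta*tan(eta) (even)
    or sqrt(gamma - eta^2) = -eta*cot(eta) (odd); pole-free form."""
    s = math.sqrt(max(gamma - eta * eta, 0.0))
    if even:
        return eta * math.sin(eta) - s * math.cos(eta)
    return -eta * math.cos(eta) - s * math.sin(eta)


def bound_state_etas(gamma, iterations=200):
    """All solutions eta in (0, sqrt(gamma)], alternating even, odd, even, ..."""
    radius = math.sqrt(gamma)
    roots = []
    n = 0
    while n * math.pi / 2 < radius:
        even = (n % 2 == 0)
        lo = n * math.pi / 2
        hi = min((n + 1) * math.pi / 2, radius)
        f_lo = _mismatch(lo, gamma, even)
        for _ in range(iterations):           # plain bisection
            mid = 0.5 * (lo + hi)
            f_mid = _mismatch(mid, gamma, even)
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
        n += 1
    return roots


gamma = 4.0   # Gamma = m*V0*a^2 / (2*hbar^2), an arbitrary illustrative value
for eta in bound_state_etas(gamma):
    print(f"eta = {eta:.6f}   E/V0 = {eta * eta / gamma:.6f}")
```

The number of roots returned reproduces the counting rule (3.354), and for very large $\Gamma$ the lowest root approaches $\pi/2$, as in comment 4.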


3.8. General One-Dimensional Motion 

We again consider the potential energy with
\[ V(x) = \begin{cases} V_L\ (\text{constant}) & \text{for } x < x_L \\ V(x)\ (\text{arbitrary}) & \text{for } x_L < x < x_R \\ V_R\ (\text{constant}) & \text{for } x > x_R \end{cases} \tag{3.359} \]
and we choose $V_R > V_L$ for definiteness. We found earlier that
\[ E > V_R > V_L \ \text{- 2-fold degenerate continuum} \tag{3.360} \]
\[ V_R > E > V_L \ \text{- non-degenerate continuum} \tag{3.361} \]
\[ V_R > V_L > E \ \text{- non-degenerate discrete states (if they exist)} \tag{3.362} \]

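The following eigenfunctions of $H$ form a CON set, where the constants $A$ and $B$ (carrying the subscripts $L$, $R$ and superscripts $(1)$, $(2)$ shown below) depend on $E$.

$E > V_R > V_L$: $u_{CE}^{(\alpha)}(x)$ with $\alpha = 1, 2$ are 2 linearly independent eigenfunctions with the same $E$ (subscript $C$ stands for continuum). With
\[ k_L = \sqrt{\frac{2m}{\hbar^2}(E - V_L)} \quad\text{and}\quad k_R = \sqrt{\frac{2m}{\hbar^2}(E - V_R)} \tag{3.363} \]
we have
\[ u_{CE}^{(1)}(x) = \begin{cases} A_L^{(1)}e^{ik_Lx} + B_L^{(1)}e^{-ik_Lx} & \text{for } x < x_L \\ A_R^{(1)}e^{ik_Rx} & \text{for } x > x_R \\ \text{complicated} & \text{for } x_L < x < x_R \end{cases} \tag{3.364} \]
\[ u_{CE}^{(2)}(x) = \begin{cases} B_L^{(2)}e^{-ik_Lx} & \text{for } x < x_L \\ A_R^{(2)}e^{ik_Rx} + B_R^{(2)}e^{-ik_Rx} & \text{for } x > x_R \\ \text{complicated} & \text{for } x_L < x < x_R \end{cases} \tag{3.365} \]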


These particular linearly independent eigenfunctions have a simple interpretation (see Figure 3.25 and Figure 3.26 below).

Figure 3.25: $u_{CE}^{(1)}(x)$ (incident and reflected waves for $x < x_L$, transmitted wave for $x > x_R$)

and

Figure 3.26: $u_{CE}^{(2)}(x)$

$V_R > E > V_L$: $u_{CE}(x)$ - no degeneracy. With
\[ k_L = \sqrt{\frac{2m}{\hbar^2}(E - V_L)} \quad\text{and}\quad k_R = \sqrt{\frac{2m}{\hbar^2}(V_R - E)} \tag{3.366} \]
we have
\[ u_{CE}(x) = \begin{cases} A_Le^{ik_Lx} + B_Le^{-ik_Lx} & \text{for } x < x_L \\ B_Re^{-k_Rx} & \text{for } x > x_R \\ \text{complicated} & \text{for } x_L < x < x_R \end{cases} \tag{3.367} \]
This eigenfunction has a simple interpretation (see Figure 3.27 below).

Figure 3.27: $u_{CE}(x)$ (oscillatory for $x < x_L$, non-oscillatory for $x > x_R$)

$V_R > V_L > E$: $u_n(x)$ - non-degenerate bound states ($E_n$). With
\[ k_L = \sqrt{\frac{2m}{\hbar^2}(V_L - E)} \quad\text{and}\quad k_R = \sqrt{\frac{2m}{\hbar^2}(V_R - E)} \tag{3.368} \]
we have
\[ u_n(x) = \begin{cases} A_Le^{k_Lx} & \text{for } x < x_L \\ B_Re^{-k_Rx} & \text{for } x > x_R \\ \text{complicated} & \text{for } x_L < x < x_R \end{cases} \tag{3.369} \]
This eigenfunction has a simple interpretation (see Figure 3.28 below).

Figure 3.28: $u_n(x)$


Note: Solving the time-independent Schrodinger equation for all x will relate 
the A and B coefficients appearing in a given eigenfunction. 

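Normalization: The following normalizations are chosen:

$$\langle u_{n'} \mid u_n \rangle = \delta_{n'n}$$
$$\langle u_{CE'} \mid u_{CE} \rangle = \delta(E' - E)$$
$$\langle u^{(\alpha')}_{CE'} \mid u^{(\alpha)}_{CE} \rangle = \delta_{\alpha'\alpha}\,\delta(E' - E) \tag{3.370}$$

that is, the continuum eigenfunctions are chosen to be normalized to $\delta$-functions of the energy.

Let $\psi(x,t)$ be the wave function for a particle ($\psi(x,t)$ must be normalizable). Now

$$\psi(x,t) = u_n(x)\,e^{-iE_n t/\hbar} \tag{3.371}$$

is a possible wave function for a bound state (we must have $\langle \psi(t) \mid \psi(t) \rangle = 1$).

However,

$$\psi(x,t) = \left\{ \begin{matrix} u_{CE}(x) \\ u^{(\alpha)}_{CE}(x) \end{matrix} \right\} e^{-iEt/\hbar} \tag{3.372}$$

is not a possible wave function because $u_{CE}$ and $u^{(\alpha)}_{CE}$ are not normalizable.

Consider, however, the wave function constructed as follows:

$$\psi(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ \begin{matrix} u_{CE}(x) \\ u^{(\alpha)}_{CE}(x) \end{matrix} \right\} e^{-iEt/\hbar} \tag{3.373}$$

where we assume that $\Delta$ is small. For $u_{CE}$ we have $V_R > E_0 > V_L$ and for $u^{(\alpha)}_{CE}$ we have $E_0 > V_R > V_L$. The integral corresponds to equal weighting of the energies in the range $[E_0, E_0 + \Delta]$.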


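This $\psi(x,t)$ (using $u_{CE}$, $u^{(1)}_{CE}$ or $u^{(2)}_{CE}$) is normalizable and therefore a possible wave function. Considering $u_{CE}$ for definiteness, we have

$$\begin{aligned}
\langle \psi(t) \mid \psi(t) \rangle &= \int_{-\infty}^{\infty} dx\, \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE\, u^*_{CE}(x)\, e^{+iEt/\hbar}\, \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE'\, u_{CE'}(x)\, e^{-iE't/\hbar} \\
&= \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE\, e^{+iEt/\hbar} \int_{E_0}^{E_0+\Delta} dE'\, e^{-iE't/\hbar} \int_{-\infty}^{\infty} dx\, u^*_{CE}(x)\, u_{CE'}(x) \\
&= \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE\, e^{+iEt/\hbar} \int_{E_0}^{E_0+\Delta} dE'\, e^{-iE't/\hbar}\, \langle u_{CE} \mid u_{CE'} \rangle \\
&= \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE\, e^{+iEt/\hbar} \int_{E_0}^{E_0+\Delta} dE'\, e^{-iE't/\hbar}\, \delta(E' - E) \\
&= \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE\, e^{+iEt/\hbar}\, e^{-iEt/\hbar} = \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE = 1
\end{aligned}$$

for all time.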


Of course, there are other possible wave functions. However, these particular wave functions have a simple physical significance and can be prepared experimentally (as we will shortly see).


Note: We have

$$\lim_{t \to \pm\infty} \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ \begin{matrix} u_{CE}(x) \\ u^{(\alpha)}_{CE}(x) \end{matrix} \right\} e^{-iEt/\hbar} = 0 \quad \text{for any } x \tag{3.374}$$

The reason for obtaining zero is that

$$e^{-iEt/\hbar} = \cos\frac{Et}{\hbar} - i \sin\frac{Et}{\hbar} \tag{3.375}$$

oscillates very rapidly as $E$ varies over $[E_0, E_0 + \Delta]$ when $t$ is infinite. These oscillations of the integrand as we integrate from $E_0$ to $E_0 + \Delta$ lead to alternating positive and negative values which cancel. In addition, $\langle \psi(t) \mid \psi(t) \rangle = 1$ for all $t$ (including $t \to \pm\infty$). Thus, in some sense, $\psi(x, t = \pm\infty)$ must be non-zero somewhere. The only possibility is $\lim_{t \to \pm\infty} \psi(x,t) \neq 0$ when $|x| \to \infty$. Thus, this $\psi(x,t)$ represents an unbound state: the only place there is a non-zero probability of finding the particle when $t \to \pm\infty$ is $|x| \to \infty$. Now for definiteness, let us consider $u^{(1)}_{CE}$. We have


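$$\psi(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE\, u^{(1)}_{CE}(x)\, e^{-iEt/\hbar}, \qquad E_0 > V_R > V_L \tag{3.376}$$

For $x < x_L$:

$$\psi(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ A_L^{(1)}(E)\, e^{ik_L x} + B_L^{(1)}(E)\, e^{-ik_L x} \right\} e^{-iEt/\hbar} \tag{3.377}$$

For $x > x_R$:

$$\psi(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ A_R^{(1)}(E)\, e^{ik_R x} \right\} e^{-iEt/\hbar} \tag{3.378}$$

For $x_L < x < x_R$, $\psi(x,t)$ is complicated and depends on the detailed form of $V(x)$.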


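Now define the following functions (Figure 3.25 motivates these definitions):

$$\psi_{inc}(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ A_L^{(1)}(E)\, e^{ik_L x} \right\} e^{-iEt/\hbar} \tag{3.379}$$

$$\psi_{refl}(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ B_L^{(1)}(E)\, e^{-ik_L x} \right\} e^{-iEt/\hbar} \tag{3.380}$$

$$\psi_{trans}(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE \left\{ A_R^{(1)}(E)\, e^{ik_R x} \right\} e^{-iEt/\hbar} \tag{3.381}$$

We let the above equations define the functions $\psi_{inc}$ (incident), $\psi_{refl}$ (reflected) and $\psi_{trans}$ (transmitted) for all $x$ and $t$.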
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Now 


so that 



(3.389) 

(3.390) 


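Now let

$$p_{L,R} = \hbar k_{L,R} \quad \text{and} \quad v_{L,R} = \frac{p_{L,R}}{m} \tag{3.391}$$

so that

$$\frac{dk_{L,R}}{dE} = \frac{1}{\hbar v_{L,R}} \tag{3.392}$$

Also define

$$v_{L0} = (v_L)_{E=E_0} \quad \text{and} \quad v_{R0} = (v_R)_{E=E_0} \tag{3.393}$$

Thus, with $\alpha_L^{(1)}(E)$, $\beta_L^{(1)}(E)$ and $\alpha_R^{(1)}(E)$ denoting the phases of $A_L^{(1)}(E)$, $B_L^{(1)}(E)$ and $A_R^{(1)}(E)$ respectively,

$$\psi_{inc}(x,t) \neq 0 \quad \text{for} \quad \left[ \frac{d\alpha_L^{(1)}}{dE} + \frac{1}{\hbar v_L}\,x - \frac{t}{\hbar} \right]_{E_0} \approx 0 \tag{3.394}$$

that is, for

$$x \approx v_{L0} \left[ t - \hbar \left( \frac{d\alpha_L^{(1)}}{dE} \right)_{E_0} \right] \tag{3.395}$$

or the associated particle moves in the $+x$ direction with speed $v_{L0}$.

Similarly,

$$\psi_{refl}(x,t) \neq 0 \quad \text{for} \quad \left[ \frac{d\beta_L^{(1)}}{dE} - \frac{1}{\hbar v_L}\,x - \frac{t}{\hbar} \right]_{E_0} \approx 0 \tag{3.396}$$

that is, for

$$x \approx -v_{L0} \left[ t - \hbar \left( \frac{d\beta_L^{(1)}}{dE} \right)_{E_0} \right] \tag{3.397}$$

or the particle moves in the $-x$ direction with speed $v_{L0}$.

Finally,

$$\psi_{trans}(x,t) \neq 0 \quad \text{for} \quad \left[ \frac{d\alpha_R^{(1)}}{dE} + \frac{1}{\hbar v_R}\,x - \frac{t}{\hbar} \right]_{E_0} \approx 0 \tag{3.398}$$

that is, for

$$x \approx v_{R0} \left[ t - \hbar \left( \frac{d\alpha_R^{(1)}}{dE} \right)_{E_0} \right] \tag{3.399}$$

or the particle moves in the $+x$ direction with speed $v_{R0}$.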
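The claim that the incident packet is peaked at $x \approx v_{L0} t$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes units with $\hbar = m = 1$, a constant real coefficient $A_L^{(1)}(E) = 1$ (so the phase derivative $d\alpha/dE$ vanishes), $V_L = 0$, and illustrative values $E_0 = 1$, $\Delta = 0.05$. The function names are hypothetical.

```python
import numpy as np

HBAR = 1.0   # assumed units, not from the text
M = 1.0
E0, DELTA = 1.0, 0.05   # illustrative central energy and small spread


def incident_packet(x, t, n_energy=400):
    """(1/sqrt(Delta)) * integral over [E0, E0+Delta] of exp(i(kx - Et/hbar)) dE,
    evaluated with the midpoint rule and a constant coefficient A(E) = 1."""
    dE = DELTA / n_energy
    E = E0 + (np.arange(n_energy) + 0.5) * dE
    k = np.sqrt(2.0 * M * E) / HBAR
    integrand = np.exp(1j * (np.outer(x, k) - E * t / HBAR))
    return integrand.sum(axis=1) * dE / np.sqrt(DELTA)


def peak_position(t):
    """Position of the maximum of |psi_inc(x, t)|^2 on a fixed grid."""
    x = np.linspace(-600.0, 600.0, 2401)
    density = np.abs(incident_packet(x, t)) ** 2
    return x[np.argmax(density)]


v0 = np.sqrt(2.0 * E0 / M)   # v_L0 = hbar k_L0 / m evaluated at E0
for t in (-200.0, 0.0, 200.0):
    print(f"t = {t:7.1f}   peak at x = {peak_position(t):8.2f}   v0*t = {v0 * t:8.2f}")
```

Because the energies in $[E_0, E_0 + \Delta]$ carry slightly different velocities, the peak sits within a few percent of $v_{L0} t$ rather than exactly on it; the agreement improves as $\Delta$ is made smaller.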


3.8.1. Physical Significance of form of $\psi(x,t)$

(a) The wave function has energy approximately equal to $E_0$ with a small spread $\Delta$.

(b) Let $t \to -\infty$. Then

$\psi_{inc} \neq 0$ for $x \to -\infty$
$\psi_{refl} \neq 0$ for $x \to +\infty$
$\psi_{trans} \neq 0$ for $x \to -\infty$

which implies that

for $x < x_L$: $\psi(x,t) = \psi_{inc}(x,t) + \psi_{refl}(x,t) \to \psi_{inc}(x,t)$ as $t \to -\infty$

for $x > x_R$: $\psi(x,t) = \psi_{trans}(x,t) \to 0$ as $t \to -\infty$

Thus, $\lim_{t \to -\infty} \psi(x,t)$ is a localized wave function located far to the left of $x_L$ and traveling to the right at speed $v_{L0}$ (the incident wave).

(c) Let $t \to +\infty$. Then

$\psi_{inc} \neq 0$ for $x \to +\infty$
$\psi_{refl} \neq 0$ for $x \to -\infty$
$\psi_{trans} \neq 0$ for $x \to +\infty$

which implies that

for $x < x_L$: $\psi(x,t) = \psi_{inc}(x,t) + \psi_{refl}(x,t) \to \psi_{refl}(x,t)$ as $t \to +\infty$

for $x > x_R$: $\psi(x,t) = \psi_{trans}(x,t)$ as $t \to +\infty$

Thus, $\lim_{t \to +\infty} \psi(x,t)$ is two localized wave functions, one located far to the left of $x_L$ and traveling to the left at speed $v_{L0}$ (the reflected wave) and the other located far to the right of $x_R$ and traveling to the right at speed $v_{R0}$ (the transmitted wave).

As noted earlier, $\lim_{t \to \pm\infty} \psi(x,t) = 0$ for $x_L < x < x_R$. Figure 3.29 below shows $\psi(x,t)$ for $t \to \pm\infty$.


Figure 3.29: Wave Packet Evolution (the region where $V(x)$ is not constant lies between the incoming and outgoing packets)


The particle does not break into 2 particles when it reaches the region $[x_L, x_R]$. Rather, the particle, incident from the left at $t \to -\infty$, can be found at $x = +\infty$ or at $x = -\infty$ when $t \to +\infty$; $\psi_{trans}$ determines the probability of transmission to $x = +\infty$ while $\psi_{refl}$ determines the probability of reflection back to $x = -\infty$.

The transmission coefficient $\mathcal{T}$ = the probability of finding the particle at $x = +\infty$ ($x > x_R$) as $t \to +\infty$.

The reflection coefficient $\mathcal{R}$ = the probability of finding the particle at $x = -\infty$ ($x < x_L$) as $t \to +\infty$.

$$\mathcal{T} = \lim_{t \to +\infty} \int_{x_R}^{\infty} dx\, \psi^*(x,t)\,\psi(x,t), \qquad \mathcal{R} = \lim_{t \to +\infty} \int_{-\infty}^{x_L} dx\, \psi^*(x,t)\,\psi(x,t) \tag{3.400}$$


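Before calculating $\mathcal{R}$ and $\mathcal{T}$, let us note the following useful relations:

(a) We have

$$\int_{-\infty}^{\infty} \frac{dx}{2\pi}\, e^{\pm i(k'-k)x} = \delta(k' - k) \tag{3.401}$$

(b) Let

$$g(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE\, e^{\pm ikx}\, e^{-iEt/\hbar}\, C(E) \tag{3.402}$$

where $\Delta$ is small and

$$\frac{dk}{dE} = \frac{1}{\hbar v} \tag{3.403}$$

We will encounter integrals of the form

$$\int_{-\infty}^{\infty} dx\, g^*(x,t)\, g(x,t) \tag{3.404}$$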

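We have

$$\begin{aligned}
\int_{-\infty}^{\infty} dx\, g^*(x,t)\, g(x,t) &= \int_{-\infty}^{\infty} dx\, \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE\, e^{\mp ikx}\, e^{iEt/\hbar}\, C^*(E)\, \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE'\, e^{\pm ik'x}\, e^{-iE't/\hbar}\, C(E') \\
&= \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE \int_{E_0}^{E_0+\Delta} dE'\, C^*(E)\, C(E')\, e^{iEt/\hbar}\, e^{-iE't/\hbar} \int_{-\infty}^{\infty} dx\, e^{\pm i(k'-k)x} \\
&= \frac{1}{\Delta} \int_{E_0}^{E_0+\Delta} dE \int_{E_0}^{E_0+\Delta} dE'\, C^*(E)\, C(E')\, e^{iEt/\hbar}\, e^{-iE't/\hbar}\, 2\pi\, \delta(k' - k)
\end{aligned}$$

Note that

$$\int dE'\, F(E,E')\, \delta(k' - k) = \int dk'\, \frac{dE'}{dk'}\, F(E,E')\, \delta(k' - k) = \left[ \frac{dE'}{dk'}\, F(E,E') \right]_{k'=k} = \frac{dE}{dk}\, F(E,E)$$

so that

$$\begin{aligned}
\int_{-\infty}^{\infty} dx\, g^*(x,t)\, g(x,t) &= \frac{2\pi}{\Delta} \int_{E_0}^{E_0+\Delta} dE \int dk'\, \frac{dE'}{dk'}\, C^*(E)\, C(E')\, e^{iEt/\hbar}\, e^{-iE't/\hbar}\, \delta(k' - k) \\
&= \frac{2\pi}{\Delta} \int_{E_0}^{E_0+\Delta} dE\, \frac{dE}{dk}\, C^*(E)\, C(E)\, e^{iEt/\hbar}\, e^{-iEt/\hbar} \\
&= \frac{2\pi\hbar}{\Delta} \int_{E_0}^{E_0+\Delta} dE\, v\, |C(E)|^2 \approx \frac{2\pi\hbar}{\Delta} \left( v_0\, |C(E_0)|^2\, \Delta \right) = 2\pi\hbar\, v_0\, |C(E_0)|^2
\end{aligned} \tag{3.405}$$

Now we can calculate $\mathcal{R}$ and $\mathcal{T}$ directly.

$$1 = \langle \psi(t) \mid \psi(t) \rangle = \int_{-\infty}^{\infty} dx\, \psi^*(x,t)\, \psi(x,t) \tag{3.406}$$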


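$$\mathcal{R} = \lim_{t \to +\infty} \int_{-\infty}^{x_L} dx\, \underbrace{\psi^*(x,t)\,\psi(x,t)}_{\psi^*_{refl}(x,t)\,\psi_{refl}(x,t)} = \lim_{t \to +\infty} \int_{-\infty}^{x_L} dx\, \psi^*_{refl}(x,t)\,\psi_{refl}(x,t) = \lim_{t \to +\infty} \int_{-\infty}^{\infty} dx\, \psi^*_{refl}(x,t)\,\psi_{refl}(x,t)$$

where the last limit replacement $x_L \to +\infty$ is valid because $\psi_{refl} \neq 0$ only for $x \to -\infty$ (as $t \to +\infty$). Therefore

$$\mathcal{R} = \lim_{t \to +\infty} \int_{-\infty}^{\infty} dx\, \psi^*_{refl}(x,t)\,\psi_{refl}(x,t) = 2\pi\hbar\, v_{L0}\, |B_L^{(1)}(E_0)|^2 \tag{3.407}$$

Similarly,

$$\mathcal{T} = \lim_{t \to +\infty} \int_{x_R}^{\infty} dx\, \underbrace{\psi^*(x,t)\,\psi(x,t)}_{\psi^*_{trans}(x,t)\,\psi_{trans}(x,t)} = \lim_{t \to +\infty} \int_{x_R}^{\infty} dx\, \psi^*_{trans}(x,t)\,\psi_{trans}(x,t) = \lim_{t \to +\infty} \int_{-\infty}^{\infty} dx\, \psi^*_{trans}(x,t)\,\psi_{trans}(x,t)$$

where the last limit replacement $x_R \to -\infty$ is valid because $\psi_{trans} \neq 0$ only for $x \to +\infty$ (as $t \to +\infty$). Therefore

$$\mathcal{T} = \lim_{t \to +\infty} \int_{-\infty}^{\infty} dx\, \psi^*_{trans}(x,t)\,\psi_{trans}(x,t) = 2\pi\hbar\, v_{R0}\, |A_R^{(1)}(E_0)|^2 \tag{3.408}$$

Note that (3.406) is valid for any $t$. We can evaluate this integral at $t = -\infty$ and at $t = +\infty$ to obtain relationships obeyed by $A_L^{(1)}$, $A_R^{(1)}$ and $B_L^{(1)}$. As $t \to -\infty$,

$$1 = \int_{-\infty}^{\infty} dx\, \psi^*(x,t)\,\psi(x,t) = \int_{-\infty}^{x_L} dx\, \psi^*_{inc}(x,t)\,\psi_{inc}(x,t) = \int_{-\infty}^{\infty} dx\, \psi^*_{inc}(x,t)\,\psi_{inc}(x,t) = 2\pi\hbar\, v_{L0}\, |A_L^{(1)}(E_0)|^2 \tag{3.409}$$

where the last limit replacement $x_L \to +\infty$ is valid because $\psi_{inc} \neq 0$ only for $x \to -\infty$ (as $t \to -\infty$). As $t \to +\infty$,

$$\begin{aligned}
1 = \int_{-\infty}^{\infty} dx\, \psi^*(x,t)\,\psi(x,t) &= \int_{-\infty}^{x_L} dx\, \psi^*_{refl}(x,t)\,\psi_{refl}(x,t) + \int_{x_R}^{\infty} dx\, \psi^*_{trans}(x,t)\,\psi_{trans}(x,t) \\
&= \int_{-\infty}^{\infty} dx\, \psi^*_{refl}(x,t)\,\psi_{refl}(x,t) + \int_{-\infty}^{\infty} dx\, \psi^*_{trans}(x,t)\,\psi_{trans}(x,t) \\
&= 2\pi\hbar\, v_{L0}\, |B_L^{(1)}(E_0)|^2 + 2\pi\hbar\, v_{R0}\, |A_R^{(1)}(E_0)|^2
\end{aligned} \tag{3.410}$$

where the limit replacement $x_R \to -\infty$ is valid because $\psi_{trans} \neq 0$ only for $x \to +\infty$ (as $t \to +\infty$) and the limit replacement $x_L \to +\infty$ is valid because $\psi_{refl} \neq 0$ only for $x \to -\infty$ (as $t \to +\infty$).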
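All of these results rest on the integral relation (3.405). It can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: units with $\hbar = m = 1$, a constant coefficient $C(E) = 1$, $k = \sqrt{2mE}/\hbar$ (zero potential), and illustrative values $E_0 = 1$, $\Delta = 0.05$; the function name `g` mirrors the text, everything else is hypothetical.

```python
import numpy as np

HBAR = 1.0   # assumed units, not from the text
M = 1.0
E0, DELTA, C = 1.0, 0.05, 1.0   # illustrative values, constant C(E)


def g(x, t=0.0, n_energy=400):
    """g(x,t) = (1/sqrt(Delta)) * integral of C exp(i(kx - Et/hbar)) dE (midpoint rule)."""
    dE = DELTA / n_energy
    E = E0 + (np.arange(n_energy) + 0.5) * dE
    k = np.sqrt(2.0 * M * E) / HBAR
    integrand = C * np.exp(1j * (np.outer(x, k) - E * t / HBAR))
    return integrand.sum(axis=1) * dE / np.sqrt(DELTA)


def norm_integral(t=0.0):
    """Riemann-sum estimate of the integral of |g|^2 over a wide but finite x range."""
    x = np.linspace(-5000.0, 5000.0, 5001)
    return np.sum(np.abs(g(x, t)) ** 2) * (x[1] - x[0])


v0 = np.sqrt(2.0 * E0 / M)                    # v_0 = hbar k_0 / m at E0
predicted = 2.0 * np.pi * HBAR * v0 * abs(C) ** 2
print("numerical:", norm_integral(), " predicted 2*pi*hbar*v0*|C|^2:", predicted)
```

The two numbers agree to about one percent. The residual difference comes from the approximation $v \approx v_0$ across the energy window and from truncating the slowly decaying tails of $|g|^2$; both shrink as $\Delta \to 0$ and the $x$ range grows.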


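Summarizing we have:

$$\mathcal{R} = 2\pi\hbar\, v_{L0}\, |B_L^{(1)}(E_0)|^2 \tag{3.411}$$

$$\mathcal{T} = 2\pi\hbar\, v_{R0}\, |A_R^{(1)}(E_0)|^2 \tag{3.412}$$

$$1 = 2\pi\hbar\, v_{L0}\, |A_L^{(1)}(E_0)|^2 \tag{3.413}$$

$$1 = 2\pi\hbar\, v_{L0}\, |B_L^{(1)}(E_0)|^2 + 2\pi\hbar\, v_{R0}\, |A_R^{(1)}(E_0)|^2 \tag{3.414}$$

When one invokes the time-independent Schrodinger equation for $u^{(1)}_{CE}$, one obtains $B_L^{(1)}$ and $A_R^{(1)}$ in terms of $A_L^{(1)}$. $A_L^{(1)}$ is usually chosen as the arbitrary constant one gets when solving the differential equation for $u^{(1)}_{CE}$.

We then impose

$$|A_L^{(1)}|^2 = \frac{1}{2\pi\hbar\, v_{L0}} \tag{3.415}$$

by requiring that $\langle \psi(t) \mid \psi(t) \rangle = 1$. Then we have

$$\mathcal{R} = 2\pi\hbar\, v_{L0}\, |B_L^{(1)}(E_0)|^2 = \left| \frac{B_L^{(1)}(E_0)}{A_L^{(1)}(E_0)} \right|^2 \tag{3.416}$$

and

$$\mathcal{T} = 2\pi\hbar\, v_{R0}\, |A_R^{(1)}(E_0)|^2 = \frac{v_{R0}}{v_{L0}} \left| \frac{A_R^{(1)}(E_0)}{A_L^{(1)}(E_0)} \right|^2 \tag{3.417}$$

where

$$\frac{v_{R0}}{v_{L0}} = \frac{k_{R0}}{k_{L0}} \quad \text{because} \quad v = \frac{\hbar k}{m} \tag{3.418}$$

Thus,

$$\mathcal{R} = \left| \frac{B_L^{(1)}(E_0)}{A_L^{(1)}(E_0)} \right|^2, \qquad \mathcal{T} = \frac{k_{R0}}{k_{L0}} \left| \frac{A_R^{(1)}(E_0)}{A_L^{(1)}(E_0)} \right|^2 \tag{3.419}$$

where

$$u^{(1)}_{CE}(x) = \begin{cases} A_L^{(1)} e^{ik_L x} + B_L^{(1)} e^{-ik_L x} & \text{for } x < x_L \\ A_R^{(1)} e^{ik_R x} & \text{for } x > x_R \end{cases} \tag{3.420}$$

We also have that

$$1 = 2\pi\hbar\, v_{L0}\, |B_L^{(1)}(E_0)|^2 + 2\pi\hbar\, v_{R0}\, |A_R^{(1)}(E_0)|^2 = \left| \frac{B_L^{(1)}(E_0)}{A_L^{(1)}(E_0)} \right|^2 + \frac{v_{R0}}{v_{L0}} \left| \frac{A_R^{(1)}(E_0)}{A_L^{(1)}(E_0)} \right|^2 = \mathcal{R} + \mathcal{T} \tag{3.421}$$

which simply says that the probability of finding the particle somewhere when $t \to +\infty$ is unity.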


One can show that $\mathcal{R} + \mathcal{T} = 1$ directly from the time-independent Schrodinger equation.

Proof: Let $u = u^{(1)}_{CE}$. Then

$$-\frac{\hbar^2}{2m} \frac{d^2u}{dx^2} + V(x)\,u = Eu \tag{3.422}$$

Multiplying by $u^*$ we have

$$-\frac{\hbar^2}{2m}\, u^* \frac{d^2u}{dx^2} + V(x)\,u^*u = E\,u^*u \tag{3.423}$$

The complex conjugate of this equation ($V$ and $E$ are real) is

$$-\frac{\hbar^2}{2m}\, u \frac{d^2u^*}{dx^2} + V(x)\,u^*u = E\,u^*u \tag{3.424}$$

Subtracting (3.424) from (3.423) we get:

$$-\frac{\hbar^2}{2m} \left[ u^* \frac{d^2u}{dx^2} - u \frac{d^2u^*}{dx^2} \right] = 0 = -\frac{\hbar^2}{2m} \frac{d}{dx} \left[ u^* \frac{du}{dx} - u \frac{du^*}{dx} \right]$$

so that

$$u^* \frac{du}{dx} - u \frac{du^*}{dx} = W = \text{constant}$$

for all $x$.

For $x > x_R$, $u = A_R^{(1)} e^{ik_R x} \Rightarrow W = 2ik_R\, |A_R^{(1)}|^2$.

For $x < x_L$, $u = A_L^{(1)} e^{ik_L x} + B_L^{(1)} e^{-ik_L x} \Rightarrow W = 2ik_L\, |A_L^{(1)}|^2 - 2ik_L\, |B_L^{(1)}|^2$.

Equating these two expressions for the constant $W$ we get

$$2ik_R\, |A_R^{(1)}|^2 = 2ik_L\, |A_L^{(1)}|^2 - 2ik_L\, |B_L^{(1)}|^2$$

$$1 = \frac{k_R\, |A_R^{(1)}|^2}{k_L\, |A_L^{(1)}|^2} + \frac{|B_L^{(1)}|^2}{|A_L^{(1)}|^2} = \mathcal{T} + \mathcal{R} \tag{3.425}$$

as we found earlier.
Summary of Important Results

$$u^{(1)}_{CE}(x) = \begin{cases} A_L^{(1)} e^{ik_L x} + B_L^{(1)} e^{-ik_L x} & \text{for } x < x_L \\ A_R^{(1)} e^{ik_R x} & \text{for } x > x_R \\ \text{complicated} & \text{for } x_L < x < x_R \end{cases} \tag{3.426}$$

This solution is represented by Figure 3.30 below.

Figure 3.30: $u^{(1)}_{CE}(x)$ (incident and reflected waves to the left of the region $[x_L, x_R]$ where forces are present, transmitted wave to the right)

Now

$$\psi(x,t) = \frac{1}{\sqrt{\Delta}} \int_{E_0}^{E_0+\Delta} dE\, u^{(1)}_{CE}(x)\, e^{-iEt/\hbar} \tag{3.427}$$

represents a particle at $t = -\infty$ with the following properties:

1. It is localized at $x = -\infty$ at $t = -\infty$.

2. It has energy approximately equal to $E$ (previously called $E_0$) with a small spread $\Delta$ in energy values.

3. It is traveling in the $+x$ direction with speed $v_L$ at $t = -\infty$.


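The transmission and reflection coefficients ($\mathcal{T}$ and $\mathcal{R}$) are functions of energy $E$ and can be determined directly from $u^{(1)}_{CE}(x)$:

$$k_R = k \text{ for transmitted wave} = k_{trans} \tag{3.428}$$

$$k_L = k \text{ for incident and reflected waves} = k_{inc} = k_{refl} \tag{3.429}$$

$$\mathcal{T} = \frac{k_{trans}\, |\text{transmitted wave coefficient}|^2}{k_{inc}\, |\text{incident wave coefficient}|^2} = \frac{k_R\, |A_R^{(1)}|^2}{k_L\, |A_L^{(1)}|^2} \tag{3.430}$$

$$\mathcal{R} = \frac{|\text{reflected wave coefficient}|^2}{|\text{incident wave coefficient}|^2} = \frac{|B_L^{(1)}|^2}{|A_L^{(1)}|^2} \tag{3.431}$$

$$\mathcal{R} + \mathcal{T} = 1 \tag{3.432}$$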

Given a potential energy $V(x)$, it is simple to find the transmission and reflection coefficients as a function of the energy $E$. For a particle incident from the left with $E > V_R > V_L$, solve the time-independent Schrodinger equation for $u^{(1)}_{CE}(x)$. This will give $B_L^{(1)}$ and $A_R^{(1)}$ in terms of $A_L^{(1)}$ and $E$. $\mathcal{T}^{(1)}$ and $\mathcal{R}^{(1)}$ can then be calculated using the results just derived. For a particle incident from the right with $E > V_R > V_L$, solve the time-independent Schrodinger equation for $u^{(2)}_{CE}(x)$. This will give $B_L^{(2)}$ and $A_R^{(2)}$ in terms of $B_R^{(2)}$ and $E$, from which $\mathcal{T}^{(2)}$ and $\mathcal{R}^{(2)}$ follow in the same way. These coefficients can then be measured experimentally in the following way:

1. $\mathcal{T}^{(1)}$ and $\mathcal{R}^{(1)}$: Send a beam of $N$ particles of approximate energy $E$ toward the interaction region $[x_L, x_R]$ from the far left. If $N_R$ particles are reflected and $N_T$ particles are transmitted, then

$$\mathcal{R}^{(1)} = \frac{N_R}{N} \quad \text{and} \quad \mathcal{T}^{(1)} = \frac{N_T}{N} \quad \text{for large } N \tag{3.433}$$

2. $\mathcal{T}^{(2)}$ and $\mathcal{R}^{(2)}$: Send a beam of $N$ particles of approximate energy $E$ toward the interaction region $[x_L, x_R]$ from the far right. If $N_R$ particles are reflected and $N_T$ particles are transmitted, then

$$\mathcal{R}^{(2)} = \frac{N_R}{N} \quad \text{and} \quad \mathcal{T}^{(2)} = \frac{N_T}{N} \quad \text{for large } N \tag{3.434}$$
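The recipe for a particle incident from the left can be carried out numerically for an arbitrary $V(x)$. The following is a minimal sketch under stated assumptions, not the book's own method: it sets $A_R^{(1)} = 1$, integrates the time-independent Schrodinger equation from $x_R$ back to $x_L$ with a hand-written fourth-order Runge-Kutta step, and reads off $A_L^{(1)}$ and $B_L^{(1)}$ from $u$ and $du/dx$ at $x_L$. Units with $\hbar = m = 1$, the function name, and the two example potentials are all hypothetical choices.

```python
import numpy as np


def reflection_transmission(V, E, x_L, x_R, V_L=0.0, V_R=0.0,
                            hbar=1.0, m=1.0, n_steps=4000):
    """Return (R, T) for a particle of energy E > V_R, V_L incident from the left.

    V is a callable giving the potential for x_L < x < x_R.
    """
    k_L = np.sqrt(2.0 * m * (E - V_L)) / hbar
    k_R = np.sqrt(2.0 * m * (E - V_R)) / hbar

    def deriv(x, y):
        u, du = y
        # u'' = (2m/hbar^2) (V(x) - E) u
        return np.array([du, 2.0 * m * (V(x) - E) / hbar**2 * u])

    h = (x_L - x_R) / n_steps            # negative step: integrate right to left
    x = x_R
    # purely transmitted wave exp(i k_R x) at x_R, that is, A_R = 1
    y = np.array([np.exp(1j * k_R * x_R), 1j * k_R * np.exp(1j * k_R * x_R)])
    for _ in range(n_steps):
        k1 = deriv(x, y)
        k2 = deriv(x + h / 2, y + h / 2 * k1)
        k3 = deriv(x + h / 2, y + h / 2 * k2)
        k4 = deriv(x + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h

    u, du = y
    # decompose u = A_L exp(i k_L x) + B_L exp(-i k_L x) at x = x_L
    A_L = 0.5 * np.exp(-1j * k_L * x_L) * (u + du / (1j * k_L))
    B_L = 0.5 * np.exp(+1j * k_L * x_L) * (u - du / (1j * k_L))
    T = (k_R / k_L) / abs(A_L) ** 2
    R = abs(B_L) ** 2 / abs(A_L) ** 2
    return R, T


# Example 1: Gaussian barrier of height 1, energy below the barrier top
R1, T1 = reflection_transmission(lambda x: np.exp(-x**2), E=0.5, x_L=-8.0, x_R=8.0)
# Example 2: smooth step from V_L = 0 to V_R = 0.5
R2, T2 = reflection_transmission(lambda x: 0.5 / (1.0 + np.exp(-x)), E=1.0,
                                 x_L=-25.0, x_R=25.0, V_L=0.0, V_R=0.5)
print("barrier: R + T =", R1 + T1, "  step: R + T =", R2 + T2)
```

In both examples $\mathcal{R} + \mathcal{T}$ comes out equal to 1 up to integration error, and the barrier example gives a small but non-zero $\mathcal{T}$ even though $E$ lies below the top of the barrier.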

3.8.2. General Comments 

1. Consider the potential energy function in Figure 3.31 below. Let $E > V_R > V_L$ and let $E < V_{max}$.

Figure 3.31: $V(x)$

Classically, no particles can be transmitted because a transmitted particle would have to pass through a region in which $E < V(x)$, that is, kinetic energy $< 0$. This is classically impossible!

Note: $A_R^{(1)} \neq 0$. The proof is simple: If $A_R^{(1)} = 0$, then $u^{(1)}_{CE} = 0$ for all $x > x_R$. Therefore, at a given point $x > x_R$, $u^{(1)}_{CE} = 0$ and $du^{(1)}_{CE}/dx = 0$. But these two values uniquely determine $u^{(1)}_{CE}(x)$ for all $x \in (-\infty, +\infty)$ because the Schrodinger equation is a second-order differential equation. Thus, $u^{(1)}_{CE} = 0$ for all $x \in (-\infty, +\infty)$. But an eigenfunction must not be identically zero.

Quantum mechanically, $\mathcal{T} \neq 0$ because $A_R^{(1)} \neq 0$. Thus, there is a non-zero probability for a particle to be transmitted through the classically forbidden region. This phenomenon is called tunneling.

2. Suppose that $E > V_{max}$ for the potential energy shown in Figure 3.31. Classically, all particles will be transmitted; there cannot be any reflection. Classically, a particle is reflected at a turning point, where $v = 0$ ($v$ can change sign) or where $E = V(x)$. For $E > V_{max}$, kinetic energy $> 0$ everywhere and there is no point where $v = 0$.

Quantum mechanically, $B_L^{(1)}$ can be non-zero (although at certain energies $B_L^{(1)}$ can be zero) and there is a non-zero probability of reflection!
3. Consider the potential energy function shown in Figure 3.32 below.

Figure 3.32: $V(x)$

Classically, a particle of energy $E_0$ and initially located between $a$ and $b$ will remain bound within this region for all $t$.

Quantum mechanically, we can form a wave function

$$\psi(x,t) = \int_{E_0}^{E_0+\Delta} dE\, e^{-iEt/\hbar} \left[ C^{(1)}(E)\, u^{(1)}_{CE}(x) + C^{(2)}(E)\, u^{(2)}_{CE}(x) \right] \tag{3.435}$$

with $\Delta$ small and such that $\psi(x,t)$ is negligible outside the region $[a, b]$ at $t = 0$. However, $\lim_{t \to \infty} \psi(x,t) = 0$ for any fixed (finite) $x$ (see earlier discussion). Thus, the particle (the probability actually) eventually leaks out (tunnels out) of the bound region $[a, b]$. At $t \to \infty$, the probability of finding the particle inside the bound region $[a, b]$ is zero!

One might refer to the state described by the wave function as an unstable bound state: at $t = 0$ the particle is bound within $[a, b]$; however, after a sufficiently long time, the particle will have completely tunneled out of the bound region. Contrast this situation with the case of a true bound state (well potential already discussed). This particular potential energy function has no bound states. A bound state $u_n(x)$ is normalizable.

Thus,

$$\psi(x,t) = u_n(x)\, e^{-iE_n t/\hbar} \tag{3.436}$$

is a possible wave function (with no integration over energies occurring). $|\psi(x,t)| = |u_n(x)|$ for all $t$. In particular,

$$\lim_{t \to \infty} |\psi(x,t)| = |u_n(x)| \tag{3.437}$$

which approaches zero as $|x| \to \infty$ and which is non-negligible in some bounded region of the $x$-axis: the particle stays bound for all time!


3.9. The Symmetrical Finite Potential Well 

We now consider the potential energy function shown in Figure 3.33 below. 


Figure 3.33: A potential energy function (incident wave $Ae^{ikx}$ and reflected wave $Be^{-ikx}$ on the left of the well, transmitted wave $Ce^{ikx}$ on the right)

We found the bound states for this potential earlier. In this case, we have added 
a constant potential energy - Vo everywhere in space to the previous symmetrical 
well in order to obtain this symmetrical well. 

We have

$$V_R = V_L = 0, \qquad E > 0, \qquad k_R = k_L = k = \sqrt{\frac{2mE}{\hbar^2}} \tag{3.438}$$

$$k_2 = \sqrt{\frac{2m(E + V_0)}{\hbar^2}} \tag{3.439}$$

Let a particle of energy $E > 0$ be incident from the left. We therefore look for the eigenfunction $u(x)$ of the form:

$$u(x) = \begin{cases} u_3(x) = Ae^{ikx} + Be^{-ikx} & \text{for } x < -a/2 \\ u_2(x) = \alpha e^{ik_2 x} + \beta e^{-ik_2 x} & \text{for } -a/2 < x < a/2 \\ u_1(x) = Ce^{ikx} & \text{for } x > a/2 \end{cases} \tag{3.440}$$

so that

$$\mathcal{T} = \frac{k\,|C|^2}{k\,|A|^2} = \frac{|C|^2}{|A|^2} \quad \text{and} \quad \mathcal{R} = \frac{|B|^2}{|A|^2} = 1 - \mathcal{T} \tag{3.441}$$

To find $\mathcal{T}$ we must find $C$ in terms of $A$.

Note: $V(x) = V(-x)$, which implies that $[\Pi, H] = 0$. However, the particular $u(x)$ we are looking for (a particle incident from the left) is not a parity eigenfunction. This does not contradict the compatibility of $H$ and parity because the eigenfunctions of $H$ for $E > 0$ are 2-fold degenerate and a given eigenfunction of $H$ for $E > 0$ need not be a simultaneous eigenfunction of parity. Compatibility only implies that there exist 2 linearly independent simultaneous eigenfunctions of $H$ and parity for $E > 0$. For a bound state, $E_n < 0$, there is non-degeneracy and the corresponding eigenfunction of $H$ must be an eigenfunction of parity also.


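Matching $u(x)$ at the boundaries $a/2$ and $-a/2$ we have:

$$u_1(a/2) = u_2(a/2) \Rightarrow Ce^{ika/2} = \alpha e^{ik_2 a/2} + \beta e^{-ik_2 a/2} \tag{3.442}$$

$$\frac{du_1(a/2)}{dx} = \frac{du_2(a/2)}{dx} \Rightarrow ikCe^{ika/2} = ik_2\,\alpha e^{ik_2 a/2} - ik_2\,\beta e^{-ik_2 a/2} \tag{3.443}$$

$$u_2(-a/2) = u_3(-a/2) \Rightarrow \alpha e^{-ik_2 a/2} + \beta e^{ik_2 a/2} = Ae^{-ika/2} + Be^{ika/2} \tag{3.444}$$

$$\frac{du_2(-a/2)}{dx} = \frac{du_3(-a/2)}{dx} \Rightarrow ik_2\,\alpha e^{-ik_2 a/2} - ik_2\,\beta e^{ik_2 a/2} = ikAe^{-ika/2} - ikBe^{ika/2} \tag{3.445}$$

We have 4 linear equations in the 5 unknowns $A, B, C, \alpha, \beta$. We can solve for $B, C, \alpha, \beta$ in terms of $A$. A straightforward calculation gives:

$$\frac{C}{A} = \frac{4e^{-ika}}{4\cos(k_2 a) - 2i\left(\frac{k}{k_2} + \frac{k_2}{k}\right)\sin(k_2 a)} \tag{3.446}$$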
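The "straightforward calculation" can be cross-checked by solving the four matching conditions (3.442)-(3.445) as a linear system. The sketch below is an illustration only: it assumes units with $\hbar = m = 1$, sets $A = 1$, and uses arbitrary illustrative values for $V_0$, $a$ and $E$; the function names are hypothetical.

```python
import numpy as np


def solve_matching(E, V0, a, hbar=1.0, m=1.0):
    """Solve (3.442)-(3.445) for (B, C, alpha, beta) with A = 1."""
    k = np.sqrt(2.0 * m * E) / hbar
    k2 = np.sqrt(2.0 * m * (E + V0)) / hbar
    ep, em = np.exp(1j * k * a / 2), np.exp(-1j * k * a / 2)
    fp, fm = np.exp(1j * k2 * a / 2), np.exp(-1j * k2 * a / 2)
    # columns: B, C, alpha, beta ; rows: (3.442), (3.443), (3.444), (3.445)
    matrix = np.array([
        [0.0,          ep,          -fp,           -fm],
        [0.0,          1j * k * ep, -1j * k2 * fp,  1j * k2 * fm],
        [-ep,          0.0,          fm,            fp],
        [1j * k * ep,  0.0,          1j * k2 * fm, -1j * k2 * fp],
    ], dtype=complex)
    rhs = np.array([0.0, 0.0, em, 1j * k * em], dtype=complex)
    B, C, alpha, beta = np.linalg.solve(matrix, rhs)
    return B, C, alpha, beta


def c_over_a_closed_form(E, V0, a, hbar=1.0, m=1.0):
    """Right-hand side of (3.446)."""
    k = np.sqrt(2.0 * m * E) / hbar
    k2 = np.sqrt(2.0 * m * (E + V0)) / hbar
    denom = 4 * np.cos(k2 * a) - 2j * (k / k2 + k2 / k) * np.sin(k2 * a)
    return 4 * np.exp(-1j * k * a) / denom


B, C, alpha, beta = solve_matching(E=1.5, V0=10.0, a=2.0)
print("C/A numeric:", C, "  closed form:", c_over_a_closed_form(1.5, 10.0, 2.0))
print("|B|^2 + |C|^2 =", abs(B) ** 2 + abs(C) ** 2)
```

The numerical $C$ agrees with (3.446), and $|B|^2 + |C|^2 = 1$ as required by $\mathcal{R} + \mathcal{T} = 1$ with $k_R = k_L$.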


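Thus,

$$\begin{aligned}
\mathcal{T} = \frac{|C|^2}{|A|^2} &= \frac{16}{16\cos^2(k_2 a) + 4\left(\frac{k}{k_2} + \frac{k_2}{k}\right)^2 \sin^2(k_2 a)} \\
&= \frac{16}{16\left(1 - \sin^2(k_2 a)\right) + 4\left(\frac{k}{k_2} + \frac{k_2}{k}\right)^2 \sin^2(k_2 a)} \\
&= \frac{16}{16 + 4\left(\left(\frac{k}{k_2} + \frac{k_2}{k}\right)^2 - 4\right)\sin^2(k_2 a)} = \frac{16}{16 + 4\left(\frac{k}{k_2} - \frac{k_2}{k}\right)^2 \sin^2(k_2 a)} \\
&= \frac{1}{1 + \frac{1}{4}\left(\frac{k}{k_2} - \frac{k_2}{k}\right)^2 \sin^2(k_2 a)} = \frac{1}{1 + \frac{V_0^2}{4E(E + V_0)}\sin^2(k_2 a)}
\end{aligned} \tag{3.447}$$

with

$$k_2 = \sqrt{\frac{2m(E + V_0)}{\hbar^2}} \tag{3.448}$$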

Notes:

1. $E = 0 \Rightarrow k = 0,\ k_2 \neq 0 \Rightarrow \mathcal{T} = 0$

2. $E \to \infty \Rightarrow k \to k_2 \neq 0 \Rightarrow \mathcal{T} \to 1$

3. $\mathcal{T} = 1$ for $k_2 a = N\pi$, $N = 0, 1, 2, \ldots$, where we must accept only energies $E > 0$, which further restricts $N$.

$$k_2 = \frac{2\pi}{\lambda_2} \Rightarrow N = \frac{a}{(\lambda_2/2)} \quad \text{for } \mathcal{T} = 1 \tag{3.449}$$

$N = a/(\lambda_2/2)$ corresponds to the number of half-wavelengths in the well and implies that this must be an integer for $\mathcal{T} = 1$. The energies at which $\mathcal{T} = 1$ are called resonances.

$$\mathcal{T} = 1 \quad \text{for} \quad N^2\pi^2 = k_2^2 a^2 = \frac{2m(E + V_0)a^2}{\hbar^2} \tag{3.450}$$

Therefore,

$$E_{resonance} = -V_0 + \frac{N^2\pi^2\hbar^2}{2ma^2} \quad \text{for } N = 1, 2, 3, \ldots \tag{3.451}$$

with the restriction that only $N$ values occur for which $E_{res} > 0$ ($N = 0$ is clearly not allowed). It is interesting to note that the resonance energies occur at the bound-state energies of the well

$$V(x) = \begin{cases} -V_0 & \text{for } -a/2 < x < a/2 \\ \infty & \text{for } |x| > a/2 \end{cases} \tag{3.452}$$

A plot of $\mathcal{T}(E)$ is shown in Figure 3.34 below.

Figure 3.34: Transmission coefficient showing resonances
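The resonance structure can be tabulated directly from (3.447) and (3.451). The following is a small sketch under stated assumptions: units with $\hbar = m = 1$ and illustrative values $V_0 = 10$, $a = 2$; the function names are hypothetical.

```python
import numpy as np


def transmission(E, V0, a, hbar=1.0, m=1.0):
    """Transmission coefficient of the symmetric finite well, equation (3.447)."""
    E = np.asarray(E, dtype=float)
    k = np.sqrt(2.0 * m * E) / hbar
    k2 = np.sqrt(2.0 * m * (E + V0)) / hbar
    return 1.0 / (1.0 + 0.25 * (k / k2 - k2 / k) ** 2 * np.sin(k2 * a) ** 2)


def resonance_energies(V0, a, n_max=10, hbar=1.0, m=1.0):
    """Resonance energies from (3.451), keeping only those with E > 0."""
    N = np.arange(1, n_max + 1)
    E = -V0 + (N * np.pi * hbar) ** 2 / (2.0 * m * a**2)
    return E[E > 0]


V0, a = 10.0, 2.0
E_res = resonance_energies(V0, a)
print("resonance energies:", E_res)
print("T at resonances   :", transmission(E_res, V0, a))
print("T near E = 0      :", transmission(1e-6, V0, a))
```

For these values the first allowed resonance is $N = 3$, since $N = 1, 2$ give negative energies. $\mathcal{T}$ equals 1 at each listed energy and is close to zero as $E \to 0$, in line with Notes 1 and 3.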


3.10. The Complex Energy Plane 

Let

$$V(x) = \begin{cases} 0 & \text{for } x < x_L \\ 0 & \text{for } x > x_R \\ \text{arbitrary } V(x) & \text{for } x_L < x < x_R \end{cases} \tag{3.453}$$

where $V_L = V_R = 0$ and $V(x)$ is a real function of $x$.

We want to define a transmission coefficient $\mathcal{T}(E)$ for all complex values of $E$.
Such an object has physical significance only for E real and positive (where 
the physical continuum states are); however, it has interesting properties for 
complex E. 

The equation

$$\frac{d^2\phi}{dx^2} = \frac{2m}{\hbar^2}\left[ V(x) - E \right]\phi \tag{3.454}$$

has 2 linearly independent solutions for any complex $E$. The additional requirement that $|\phi|$ does not become infinite as $|x| \to \infty$ restricts the allowed values of $E$ to the real energies in the spectrum of the hermitian $H$.

To define $\mathcal{T}(E)$ for complex $E$, we simply drop this additional requirement and look at solutions to the differential equation for complex $E$.

We have

$$\text{for } x < x_L: \quad \frac{d^2\phi}{dx^2} = -\frac{2mE}{\hbar^2}\,\phi \tag{3.455}$$

$$\text{for } x > x_R: \quad \frac{d^2\phi}{dx^2} = -\frac{2mE}{\hbar^2}\,\phi \tag{3.456}$$

Let

$$k = \sqrt{\frac{2mE}{\hbar^2}} \tag{3.457}$$

Because $E$ is complex, we must define $\sqrt{E}$ carefully. Let $E = |E|\,e^{i\varepsilon}$. Therefore, $\sqrt{E} = |E|^{1/2}\,e^{i\varepsilon/2}$.

Now, if $\varepsilon \to \varepsilon + 2\pi N$, then $E \to E$, that is, $E$ does not change when we change its phase by a multiple of $2\pi$, but $\sqrt{E} \to \sqrt{E}\,e^{i\pi N} = (-1)^N \sqrt{E}$. Thus, $\sqrt{E}$ is not well-defined unless we specify the range in which $\varepsilon$ lies. Let $\varepsilon$ be in $[0, 2\pi]$. Then $\sqrt{E}$ is well-defined except when $E$ is positive real, that is,

$$\varepsilon = 0 \Rightarrow \sqrt{E} = |E|^{1/2} \tag{3.458}$$

but

$$\varepsilon = 2\pi \Rightarrow \sqrt{E} = |E|^{1/2}\,e^{i\pi} = -|E|^{1/2} \tag{3.459}$$

Thus, there is a line of discontinuity in $\sqrt{E}$ along the positive real axis (called a cut). This situation is shown in Figure 3.35 below in a plot of the complex energy plane.

Figure 3.35: Cut in Complex E Plane (just above the cut, $\varepsilon \to 0$ and $\sqrt{E} \to |E|^{1/2}$; just below the cut, $\varepsilon \to 2\pi$ and $\sqrt{E} \to -|E|^{1/2}$)
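For $E$ not on the cut,

$$\sqrt{E} = |E|^{1/2} \left[ \cos\frac{\varepsilon}{2} + i \sin\frac{\varepsilon}{2} \right] \tag{3.460}$$

with $\varepsilon$ in $[0, 2\pi]$ implying that $\mathrm{Im}(\sqrt{E}) > 0$. Now if

$$k = \sqrt{\frac{2mE}{\hbar^2}} = \alpha + i\beta \tag{3.461}$$

then $\beta > 0$ for $E$ not positive real. We note that $k \to$ positive real if we approach the cut from above ($\varepsilon \to 0$).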
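The branch choice described above differs from the principal square root used by most numerical libraries, whose cut runs along the negative real axis. A minimal sketch of a square root with $\varepsilon$ restricted to $[0, 2\pi)$ is shown here; the function name `sqrt_cut` is hypothetical and only Python's standard `cmath` and `math` modules are used.

```python
import cmath
import math


def sqrt_cut(E):
    """Square root of complex E with the phase of E taken in [0, 2*pi).

    The resulting cut lies along the positive real axis, and the
    imaginary part of the result is never negative.
    """
    eps = cmath.phase(E)          # cmath.phase returns a value in [-pi, pi]
    if eps < 0.0:
        eps += 2.0 * math.pi      # shift into [0, 2*pi)
    return math.sqrt(abs(E)) * cmath.exp(1j * eps / 2.0)


above = sqrt_cut(4.0 + 1e-12j)    # just above the cut
below = sqrt_cut(4.0 - 1e-12j)    # just below the cut
print("above the cut:", above)    # close to +2
print("below the cut:", below)    # close to -2
print("negative real E = -1:", sqrt_cut(-1.0 + 0j))   # close to +i
```

Approaching the positive real axis from above gives $+|E|^{1/2}$ and from below gives $-|E|^{1/2}$, which is the discontinuity sketched in Figure 3.35.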


Now,

$$\text{for } x < x_L: \quad \phi = Ae^{ikx} + Be^{-ikx} \tag{3.462}$$

$$\text{for } x > x_R: \quad \phi = Ce^{ikx} + De^{-ikx} \tag{3.463}$$

with $E$ and $k$ complex.

$C$ and $D$ may be chosen as the 2 arbitrary constants in the solution of the 2nd-order differential equation. $A$ and $B$ will then depend on $C$ and $D$ as well as $E$. Consider the solution for $D = 0$ and $C = 1$.

$$\text{for } x < x_L: \quad \phi = A(E)\,e^{ikx} + B(E)\,e^{-ikx} \tag{3.464}$$

$$\text{for } x > x_R: \quad \phi = e^{ikx} \tag{3.465}$$

Define

$$\mathcal{T}(E) = \frac{1}{|A(E)|^2} \quad \text{for complex } E \tag{3.466}$$

For $E$ positive real, this is the physical transmission coefficient if $k > 0$, that is,

$A(E)\,e^{ikx}$ is the incident wave in the $+x$ direction (3.467)

$B(E)\,e^{-ikx}$ is the wave reflected toward the $-x$ direction (3.468)

$e^{ikx}$ is the transmitted wave in the $+x$ direction (3.469)

Thus, the physical $\mathcal{T}(E)$ is obtained as we approach the cut from above ($\varepsilon \to 0$).

For positive, real $E$ and $k > 0$, we have $\mathcal{T} + \mathcal{R} = 1$ with $\mathcal{T}$ and $\mathcal{R}$ non-negative (this is not true for complex $E$). Thus, $\mathcal{T} \neq \infty$ as we approach the cut from above.

Does $\mathcal{T} = \infty$ at any complex value of $E$? Equivalently, does $A(E) = 0$ at any complex value of $E$? We have

$$\phi(x) = \begin{cases} A(E)\,e^{i\alpha x}\,e^{-\beta x} + B(E)\,e^{-i\alpha x}\,e^{+\beta x} & \text{for } x < x_L \\ e^{i\alpha x}\,e^{-\beta x} & \text{for } x > x_R \end{cases} \tag{3.470}$$

with $\beta > 0$ except on the cut ($E$ positive real).

For $\beta > 0$, $A(E) = 0$ if and only if $\phi$ is normalizable because

$e^{-\beta x} \to \infty$ for $x \to -\infty$
$e^{+\beta x} \to 0$ for $x \to -\infty$
$e^{-\beta x} \to 0$ for $x \to +\infty$

But $\phi$ is normalizable if and only if $E$ is a bound state energy eigenvalue. Furthermore, the hermiticity of $H$ implies that these energy eigenvalues occur only for real $E$. For $V_R = V_L = 0$, the bound states (if they exist) have energy $E$ real and negative. Thus, in the cut complex energy plane ($\beta > 0$), $\mathcal{T}(E) = \infty$ (called poles in the complex $E$ plane) if and only if $E$ = a bound-state energy, as shown in Figure 3.36.

Figure 3.36: Cut and Poles in Complex E Plane (cut along the positive real axis; poles $\mathcal{T}(E_n) = \infty$ at the bound-state energies on the negative real axis)


We have also shown that $\mathcal{T}(E)$ is finite as we approach the cut from above ($\varepsilon \to 0$).

As a check, consider the symmetrical potential well above. We have

$$\mathcal{T}(E) = \left| \frac{C}{A} \right|^2 = \left| \frac{4e^{-ika}}{4\cos(k_2 a) - 2i\left(\frac{k}{k_2} + \frac{k_2}{k}\right)\sin(k_2 a)} \right|^2 \tag{3.471}$$

Therefore, $\mathcal{T}(E) = \infty$ if

$$4\cos(k_2 a) = 2i\left(\frac{k}{k_2} + \frac{k_2}{k}\right)\sin(k_2 a) \tag{3.472}$$

that is, if

$$\left(\frac{k}{k_2} + \frac{k_2}{k}\right) = \frac{2}{i}\,\frac{\cos(k_2 a)}{\sin(k_2 a)} = \frac{2}{i}\,\frac{\cos^2(k_2 a/2) - \sin^2(k_2 a/2)}{2\sin(k_2 a/2)\cos(k_2 a/2)} = \frac{1}{i}\left[ \cot(k_2 a/2) - \tan(k_2 a/2) \right] \tag{3.473}$$

or

$$\cot(k_2 a/2) = \frac{ik}{k_2} \quad \text{and} \quad \tan(k_2 a/2) = -\frac{ik}{k_2} \tag{3.474}$$

Now let $k = i\kappa$. We then have

$$\cot(k_2 a/2) = -\frac{\kappa a/2}{k_2 a/2} \quad \text{and} \quad \tan(k_2 a/2) = \frac{\kappa a/2}{k_2 a/2} \tag{3.475}$$

for $\mathcal{T}(E) = \infty$. These are just the transcendental equations which determine the bound states (look at our earlier solution and recall that $-V_0$ must be added to $V(x)$).
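The correspondence between poles and bound states can be verified numerically for the lowest even state. The sketch below is an illustration under stated assumptions: units with $\hbar = m = 1$, illustrative values $V_0 = 10$ and $a = 2$, and a simple bisection on the condition $\tan(k_2 a/2) = \kappa/k_2$; it then evaluates the denominator of (3.471) at $k = i\kappa$. The function names are hypothetical.

```python
import math
import cmath


def lowest_even_bound_state(V0, a, hbar=1.0, m=1.0):
    """Solve tan(k2 a/2) = kappa/k2 for the lowest even state by bisection.

    Uses z = k2 a/2 and z0 = (a/2) sqrt(2 m V0)/hbar, so kappa a/2 = sqrt(z0^2 - z^2).
    Returns (E, kappa, k2) with E < 0 measured from the top of the well.
    """
    z0 = 0.5 * a * math.sqrt(2.0 * m * V0) / hbar

    def h(z):
        return z * math.tan(z) - math.sqrt(z0**2 - z**2)

    lo, hi = 1e-9, min(math.pi / 2.0, z0) - 1e-9   # h(lo) < 0 < h(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    z = 0.5 * (lo + hi)
    k2 = 2.0 * z / a
    kappa = 2.0 * math.sqrt(z0**2 - z**2) / a
    E = (hbar * k2) ** 2 / (2.0 * m) - V0
    return E, kappa, k2


def denominator(k, k2, a):
    """Denominator of C/A in (3.471); a zero here is a pole of T(E)."""
    return 4.0 * cmath.cos(k2 * a) - 2j * (k / k2 + k2 / k) * cmath.sin(k2 * a)


E, kappa, k2 = lowest_even_bound_state(V0=10.0, a=2.0)
print("bound-state energy:", E)
print("|denominator| at k = i*kappa:", abs(denominator(1j * kappa, k2, 2.0)))
```

The denominator vanishes (to rounding error) at $k = i\kappa$, so $\mathcal{T}(E)$ has a pole at this negative, real bound-state energy.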


3.11. Problems 

3.11.1. Free Particle in One-Dimension - Wave Functions 

Consider a free particle in one-dimension. Let 

f \2 

(X-Xq ) . Pfi X 

ip(x,0) = Ne 

where $x_0$, $p_0$ and $a$ are real constants and $N$ is a normalization constant.

(a) Find $\tilde{\psi}(p,0)$

(b) Find $\tilde{\psi}(p,t)$

(c) Find $\psi(x,t)$

(d) Show that the spread in the spatial probability distribution

$$\rho(x,t) = \frac{|\psi(x,t)|^2}{\langle \psi(t) \mid \psi(t) \rangle}$$

increases with time.


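3.11.2. Free Particle in One-Dimension - Expectation Values

For a free particle in one-dimension

$$H = \frac{p_x^2}{2m}$$

(a) Show $\langle p_x \rangle = \langle p_x \rangle_{t=0}$

(b) Show $\langle x \rangle = \left[ \frac{\langle p_x \rangle_{t=0}}{m} \right] t + \langle x \rangle_{t=0}$

(c) Show $(\Delta p_x)^2 = (\Delta p_x)^2_{t=0}$

(d) Find $(\Delta x)^2$ as a function of time and initial conditions. HINT: Find

$$\frac{d}{dt} \langle x^2 \rangle$$

To solve the resulting differential equation, one needs to know the time dependence of $\langle xp_x + p_x x \rangle$. Find this by considering

$$\frac{d}{dt} \langle xp_x + p_x x \rangle$$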
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3.11.3. Time Dependence 

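Given

$$i\hbar \frac{\partial \psi}{\partial t} = H\psi$$

with

$$H = \frac{\vec{p}^{\,2}}{2m} + V(\vec{x})$$

(a) Show that $\frac{d}{dt} \langle \psi(t) \mid \psi(t) \rangle = 0$

(b) Show that $\frac{d}{dt} \langle x \rangle = \left\langle \frac{p_x}{m} \right\rangle$

(c) Show that $\frac{d}{dt} \langle p_x \rangle = \left\langle -\frac{\partial V}{\partial x} \right\rangle$

(d) Find $\frac{d}{dt} \langle H \rangle$

(e) Find $\frac{d}{dt} \langle L_z \rangle$ and compare with the corresponding classical equation ($\vec{L} = \vec{x} \times \vec{p}$)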


3.11.4. Continuous Probability 

If p(x) = xe~ x ^ x is the probability density function over the interval 0 < x < oo, 
find the mean, standard deviation and most probable value (where the probability density is maximum) of $x$.


3.11.5. Square Wave Packet 

Consider a free particle, initially with a well-defined momentum $p_0$, whose wave function is well approximated by a plane wave. At $t = 0$, the particle is localized in a region $-a/2 < x < a/2$, so that its wave function is

$$\psi(x) = \begin{cases} A e^{ip_0 x/\hbar} & -a/2 < x < a/2 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the normalization constant $A$ and sketch $\mathrm{Re}(\psi(x))$, $\mathrm{Im}(\psi(x))$ and $|\psi(x)|^2$.

(b) Find the momentum space wave function $\tilde{\psi}(p)$ and show that it too is normalized.

(c) Find the uncertainties $\Delta x$ and $\Delta p$ at this time. How close is this to the minimum uncertainty wave function?


3.11.6. Uncertain Dart 

A dart of mass 1 kg is dropped from a height of 1 m, with the intention to hit a certain point on the ground. Estimate the limitation set by the uncertainty principle on the accuracy that can be achieved.
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3.11.7. Find the Potential and the Energy 

Suppose that the wave function of a (spinless) particle of mass $m$ is


tp(r,0,<j>) = A- 


where $A$, $\alpha$ and $\beta$ are constants such that $0 < \alpha < \beta$. Find the potential $V(r, \theta, \phi)$ and the energy $E$ of the particle.


3.11.8. Harmonic Oscillator Wave Function

In a harmonic oscillator a particle of mass $m$ and frequency $\omega$ is subject to a parabolic potential $V(x) = m\omega^2 x^2/2$. One of the energy eigenstates is $u_n(x) = Ax\exp(-x^2/x_0^2)$, as sketched below.



Figure 3.37: A Wave Function 

(a) Is this the ground state, the first excited state, second excited state, or 
third? 

(b) Is this an eigenstate of parity? 

(c) Write an integral expression for the constant $A$ that makes $u_n(x)$ a normalized state. Evaluate the integral.


3.11.9. Spreading of a Wave Packet 

A localized wave packet in free space will spread due to its initial distribution of momenta. This wave phenomenon is known as dispersion, arising because the relation between frequency $\omega$ and wavenumber $k$ is not linear. Let us look at this in detail.

Consider a general wave packet in free space at time $t = 0$, $\psi(x, 0)$.

(a) Show that the wave function at a later time is

$$\psi(x,t) = \int_{-\infty}^{\infty} dx'\, K(x, x'; t)\, \psi(x', 0)$$

where

$$K(x, x'; t) = \sqrt{\frac{m}{2\pi i\hbar t}}\, \exp\left[ \frac{im(x - x')^2}{2\hbar t} \right]$$

is known as the propagator. [HINT: Solve the initial value problem in the usual way: decompose $\psi(x,0)$ into stationary states (here plane waves), add the time dependence and then re-superpose.]

(b) Suppose the initial wave packet is a Gaussian

$$\psi(x, 0) = \frac{1}{(2\pi a^2)^{1/4}}\, e^{ik_0 x}\, e^{-x^2/4a^2}$$

Show that it is normalized.

(c) Given the characteristic width $a$, find the characteristic momentum $p_c$, energy $E_c$ and the time scale $t_c$ associated with the packet. The time $t_c$ sets the scale at which the packet will spread. Find this for a macroscopic object of mass 1 g and width $a$ = 1 cm. Comment.

(d) Show that the wave packet probability density remains Gaussian with the solution

$$P(x,t) = |\psi(x,t)|^2 = \frac{1}{\sqrt{2\pi a(t)^2}}\, \exp\left[ -\frac{(x - \hbar k_0 t/m)^2}{2a(t)^2} \right]$$

with $a(t) = a\sqrt{1 + t^2/t_c^2}$.


3.11.10. The Uncertainty Principle says ... 

Show that, for the 1-dimensional wavefunction

$$\psi(x) = \begin{cases} (2a)^{-1/2} & |x| < a \\ 0 & |x| > a \end{cases}$$

the rms uncertainty in momentum is infinite (HINT: you need to Fourier transform $\psi$). Comment on the relation of this result to the uncertainty principle.


3.11.11. Free Particle Schrodinger Equation 

The time-independent Schrodinger equation for a free particle is given by

$$\frac{1}{2m} \left( \frac{\hbar}{i} \frac{\partial}{\partial \vec{x}} \right)^2 \psi(\vec{x}) = E\,\psi(\vec{x})$$

It is customary to write $E = \hbar^2 k^2/2m$ to simplify the equation to

$$(\nabla^2 + k^2)\,\psi(\vec{x}) = 0$$

Show that

(a) a plane wave $\psi(\vec{x}) = e^{ikz}$
and

(b) a spherical wave $\psi(\vec{x}) = e^{ikr}/r$ ($r = \sqrt{x^2 + y^2 + z^2}$)

satisfy the equation. Note that in either case, the wavelength of the solution is given by $\lambda = 2\pi/k$ and the momentum by the de Broglie relation $p = \hbar k$.

3.11.12. Double Pinhole Experiment 

The double-slit experiment is often used to demonstrate how different quantum 
mechanics is from its classical counterpart. To avoid mathematical complica¬ 
tions with Bessel functions, we will discuss two pinholes rather than two slits. 
Consider the setup shown below 



Figure 3.38: The Double Pinhole Setup

Suppose you send in an electron along the $z$-axis onto a screen at $z = 0$ with two
pinholes at $x = 0$, $y = \pm d/2$. At a point $(x, y)$ on another screen at $z = L \gg d, \lambda$
the distance from each pinhole is given by $r_\pm = \sqrt{x^2 + (y \mp d/2)^2 + L^2}$.

The spherical waves from each pinhole are added at the point on the screen and 
hence the wave function is 


$$\psi(x,y) = \frac{e^{ikr_+}}{r_+} + \frac{e^{ikr_-}}{r_-}$$

where $k = 2\pi/\lambda$. Answer the following questions.

(a) Considering just the exponential factors, show that constructive interference
appears approximately at

$$\frac{y}{r} = n\,\frac{\lambda}{d} \qquad (n \in \mathbb{Z}) \tag{3.476}$$

where $r = \sqrt{x^2 + y^2 + L^2}$.

(b) Make a plot of the intensity $|\psi(0,y)|^2$ as a function of $y$, by choosing $k = 1$,
$d = 20$ and $L = 1000$. Use the Mathematica Plot function. The intensity
$|\psi(0,y)|^2$ is interpreted as the probability distribution for the electron to
be detected on the screen, after repeating the same experiment many,
many times.

(c) Make a contour plot of the intensity $|\psi(x,y)|^2$ as a function of $x$ and $y$,
for the same parameters, using the Mathematica ContourPlot function.

(d) If you place a counter at both pinholes to see if the electron has passed
one of them, all of a sudden the wave function collapses. If the electron
is observed to pass through the pinhole at $y = +d/2$, the wave function
becomes

$$\psi_+(x,y) = \frac{e^{ikr_+}}{r_+}$$

If it is observed to pass through the pinhole at $y = -d/2$, the wave function
becomes

$$\psi_-(x,y) = \frac{e^{ikr_-}}{r_-}$$

After repeating this experiment many times with a 50:50 probability for
each of the pinholes, the probability on the screen will be given by

$$|\psi_+(x,y)|^2 + |\psi_-(x,y)|^2$$

instead. Plot this function on the $y$-axis and also show the contour plot
to compare its pattern to the case when you do not place a counter. What
is the difference from the case without the counter?

3.11.13. A Falling Pencil

Using the uncertainty principle, estimate how long a pencil can be balanced
on its point.

Chapter 4

The Mathematics of Quantum Physics: 
Dirac Language 


Our study of the mathematics of quantum mechanics assumes that you have a 
good background in the following areas: 

• Calculus 

Differentiation and integration 

• Infinite series 

Taylor and power series 

• Linear Algebra 

Linear vector spaces and matrices 

• Multivariable Calculus 
Partial derivatives 

Gradient, divergence, curl and Laplacian 

• Mathematical Methods 

Ordinary and partial differential equations 
Fourier series and Fourier transforms 

This study of the mathematical formalism of quantum theory will center around 
the subject of linear vector spaces. We will present this subject using Dirac 
language and connect it to the physics as we proceed. At the start of our 
discussion, we will mix standard mathematical notation and the Dirac language 
so that the transition to using only the Dirac language will be easier. We will 
see parallels to the mathematics used in our study of wave mechanics in earlier 
chapters; we will repeat many ideas from earlier in this new formalism. 

Quantum systems cannot be described in terms of our everyday experience. To
understand them it is necessary to use the abstract language provided by the
algebra of linear operators on Hilbert space.

When we present the mathematical formalism appropriate to a physical theory
we have two choices. We can approach the subject abstractly and deal directly
with mathematical quantities of fundamental importance or we can choose a
particular representation (abstract coordinate system) and work
with the numbers (functions) that correspond to the fundamental quantities in
that coordinate representation.

We will follow the abstract approach in most cases since it will allow us to delve 
more deeply into the fundamental nature of the physical theory and, in addition, 
enable us to treat the physical laws in a precise and efficient manner. 


4.1. Mappings 

Given two sets $A$ and $B$, let $a \in A$, $b \in B$. A mapping $T$ from $A$ to $B$:

$$b = Ta \tag{4.1}$$

can be of the following types:

1. $T$ is a mapping of $A$ into $B$ if to each $a \in A$ there corresponds a definite
element $Ta \in B$ (there may be elements of $B$ that do not correspond to
any element of $A$, and different elements of $A$ may correspond to the same
element of $B$). The range of the mapping is the subset $TA \subset B$ ($TA$
is a subset of $B$, in general not equal to $B$) formed by those elements that
correspond to some element of $A$.

2. $T$ is a mapping of $A$ onto $B$ if to each element of $A$ there corresponds a
definite element $Ta \in B$ and to every element of $B$ there corresponds at
least one element of $A$ (in this case $TA = B$).

3. A one-to-one mapping is one where distinct elements of $A$ correspond to
distinct elements of $B$: $Ta_1 \neq Ta_2$ if $a_1 \neq a_2$.

4. It follows that if $T$ is a one-to-one mapping from $A$ onto $B$, there exists
an inverse one-to-one mapping $T^{-1}$ from $B$ onto $A$. Such an inverse, defined
on all of $B$, does not exist if $T$ is a one-to-one mapping of $A$ into, but not onto, $B$.
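For finite sets the distinctions above can be checked mechanically. The following is a minimal sketch, assuming the mapping is stored as a Python dictionary; the names `classify_mapping` and `inverse_mapping` are illustrative and not part of the text.

```python
def classify_mapping(mapping, codomain):
    """Classify a finite mapping T: A -> B given as a dict {a: T(a)}.

    `codomain` is the set B. The keys of `mapping` play the role of A.
    """
    image = set(mapping.values())          # the range TA
    if not image <= set(codomain):
        raise ValueError("some T(a) lies outside B, so T is not a mapping into B")
    onto = image == set(codomain)          # every b in B is hit at least once
    one_to_one = len(image) == len(mapping)  # distinct a give distinct T(a)
    return {"onto": onto, "one_to_one": one_to_one,
            "invertible": onto and one_to_one}


def inverse_mapping(mapping):
    """Inverse of a one-to-one mapping, defined on its range TA."""
    return {b: a for a, b in mapping.items()}


# A one-to-one mapping of A onto B: an inverse from B onto A exists.
T1 = {1: "x", 2: "y", 3: "z"}
# A mapping of A into B that is neither onto nor one-to-one.
T2 = {1: "x", 2: "x"}

result1 = classify_mapping(T1, {"x", "y", "z"})
result2 = classify_mapping(T2, {"x", "y"})
```

In this sketch `inverse_mapping(T1)` returns a mapping from $B$ back onto $A$, while for `T2` no such inverse can be built because two elements of $A$ share the same image.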


4.2. Linear Vector Spaces 

The mathematical formalism of quantum mechanics is based on the theory of
linear vector spaces. In this chapter we shall present a complete exposition of
the mathematics of linear vector spaces. We shall state theorems without proofs
(excellent proofs are available in a variety of texts and equivalent proofs often
have been given in Chapters 2 and 3 of this text). Instead of proofs, we shall
relate the theorems where appropriate to the physics of quantum mechanics using
the Dirac language and provide concrete examples that will help us understand
the physical content in our later discussions.


The number of configurations experimental instruments can exhibit is finite.
This implies that, in principle, only the language of finite-dimensional vector
spaces will be needed to explain experimental results and to understand the
theoretical structure of quantum mechanics. However, if we want to embed
the theory in a spacetime continuum, then it will be necessary to consider
idealized instruments capable of an infinite number of configurations. This will
require a description using the language of infinite-dimensional spaces, in
particular the use of a vector space called a non-separable or rigged Hilbert space.
Because these idealized infinite instruments are approximations of the actual
finite ones, physicists usually ignore those properties of the infinite-dimensional
Hilbert space that cannot be derived from the properties of finite-dimensional
spaces by some, not necessarily unique, physically based limiting procedure. We
already saw some of these problems discussed for continuum wave functions in
wave mechanics in Chapters 2 and 3.

A working knowledge of the mathematical description that results from the
adopted limiting procedure is necessary to understand many of the developments
of quantum mechanics. The following mathematical presentation reflects
these considerations. The results pertaining to finite-dimensional spaces,
necessary for the understanding of the structure of quantum mechanics, are presented
with thoroughness. The generalization to infinite-dimensional spaces, which is
a very difficult task, is discussed in less detail.

These mathematical details are usually ignored in most textbooks, which I
believe makes it very difficult to understand the fundamental ideas underlying
quantum mechanics.

In most of our discussions we can assume we are in a finite-dimensional vector
space and the results will generalize without change to the Hilbert space case.
We will deal with the special problems associated with the infinite dimensionality
of Hilbert space as they arise.


4.3. Linear Spaces and Linear Functionals 

A vector space is defined with reference to a field. We consider the case where 
the field is the field C of complex numbers because this is the case of interest 
in quantum mechanics. 

Ket Vectors 

The mathematical vectors that will allow us to describe physical states will
be called ket vectors or kets and be represented by the symbol $|\ldots\rangle$ (due to
Dirac). We will label different vectors (states) according to their associated
physical (measurable) properties and these will be inserted inside the ket symbol
$|a, b, \ldots\rangle$. The ket vectors will form the basis for the Dirac language of the
associated vector space defined below.

These simple mathematical objects will turn out to have sufficient mathematical 
structure to allow us to represent all of the important features of quantum 
theory in terms of them. Whether these mathematical objects have any objective 
physical content themselves must be discussed later as we develop the theory. 

At this point, however, let us state a couple of the basic properties of the ket 
vectors and their connection to physics to help set the stage for the discussion 
that will follow. 

As with any other vectors, kets can be multiplied by complex numbers and can
be added together to get another ket vector

$$|g\rangle = c_1|a\rangle + c_2|b\rangle \tag{4.2}$$

where $c_1$ and $c_2$ are complex numbers.

The crucial assumptions we will make later as we connect the Dirac language, 
the mathematical formalism and the physics are: 

• Each state of a physical system at a given time can be mathematically 
described by a definite ket vector. There is some correspondence! 

• If a physical state is a superposition (sum) of other physical states, its
corresponding ket vector is a linear sum (combination) of the ket vectors
corresponding to the other states.

The state $|g\rangle$ in (4.2) is a superposition of the states $|a\rangle$ and $|b\rangle$ with the
mathematical properties of this superposition defined precisely by the two complex
numbers $c_1$ and $c_2$. For example, the state of a photon passing through a
double-slit apparatus might be described by the superposition

$$|\text{photon in 2-slit apparatus}\rangle = a\,|\text{slit 1}\rangle + b\,|\text{slit 2}\rangle \tag{4.3}$$

where $|\text{slit 1}\rangle$ is the state of a photon that passes through slit 1 and so on.

With these tantalizing thoughts rambling about in your minds, let us turn to 
the mathematical formalism. 

Definition: 

A linear vector space $V$ is a set of abstract elements, called vectors,

$$|1\rangle, |2\rangle, \ldots, |M\rangle, \ldots, |N\rangle, \ldots \tag{4.4}$$

for which there exist two rules:

1. a rule for creating a vector sum or vector addition,

$$|7\rangle + |12\rangle$$

2. a rule for multiplication of a vector by a complex scalar $c$,

$$c\,|13\rangle$$

The following properties hold:

1. the result of applying either or both of these rules is another vector in the
same space; this is called closure

$$c_1|7\rangle + c_2|12\rangle \in V$$

2. scalar multiplication is distributive with respect to the vectors

$$c\,(|7\rangle + |12\rangle) = c\,|7\rangle + c\,|12\rangle$$

3. scalar multiplication is associative

$$c_1(c_2|7\rangle) = c_1 c_2|7\rangle$$

4. addition is commutative

$$|7\rangle + |12\rangle = |12\rangle + |7\rangle$$

5. addition is associative

$$|6\rangle + (|7\rangle + |12\rangle) = (|6\rangle + |7\rangle) + |12\rangle$$

6. there exists a unique zero or null vector $|0\rangle$ such that

$$|7\rangle + |0\rangle = |7\rangle \quad \text{and} \quad 0\,|M\rangle = |0\rangle$$

7. for every vector $|M\rangle$ there exists a unique additive inverse vector $|-M\rangle$
where

$$|M\rangle + |-M\rangle = |0\rangle \quad \text{and} \quad |-M\rangle = -|M\rangle$$

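Example: Consider the set of all 3-tuples, which is a particular example of a
finite-dimensional vector space

$$|i\rangle = \begin{pmatrix} a_i \\ b_i \\ c_i \end{pmatrix} \tag{4.5}$$

Addition is defined componentwise by

$$|i\rangle + |j\rangle = \begin{pmatrix} a_i + a_j \\ b_i + b_j \\ c_i + c_j \end{pmatrix} \tag{4.6}$$

multiplication by a scalar $q$ is defined by

$$q\,|i\rangle = \begin{pmatrix} q a_i \\ q b_i \\ q c_i \end{pmatrix} \tag{4.7}$$

and the null vector is

$$|0\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{4.8}$$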

We must emphasize that use of the word “vector” here is not meant to imply 
that we are talking about a mathematical object that in any way needs to 
possess properties such as magnitude and direction. These “vectors” are abstract 
mathematical objects whose properties are defined above. 

Other examples are: 

1. the set of all functions $f(x)$ of a real variable $x$ for which

$$\int |f(x)|^2\, dx < \infty$$

with addition and multiplication by a scalar defined by

$$(f + g)(x) = f(x) + g(x) \quad \text{and} \quad (af)(x) = a f(x)$$

This space is called $L^2$.

2. the set of all infinite sequences of numbers $x = (x_1, x_2, x_3, \ldots, x_k, \ldots)$
with addition and multiplication by a scalar defined by

$$x + y = (x_1 + y_1, x_2 + y_2, x_3 + y_3, \ldots, x_k + y_k, \ldots)$$

$$ax = (ax_1, ax_2, ax_3, \ldots, ax_k, \ldots)$$

3. the set of all $2 \times 2$ matrices

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

with addition and multiplication by a scalar defined by

$$A + B = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix}$$

$$cA = c\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} ca_{11} & ca_{12} \\ ca_{21} & ca_{22} \end{pmatrix}$$

These examples should make it clear that it is important, as we have 
emphasized already, not to think about magnitudes and directions as being 
the defining properties of “vectors” in any way. We must be very careful 
not to attribute any specific properties derived from special examples as 
general properties of vectors in any abstract vector space. 

Isomorphism - Two vector spaces $U$ and $V$ (defined over the same field) are said
to be isomorphic if there exists a one-to-one and onto correspondence $u \leftrightarrow v$
between the vectors $u \in U$ and the vectors $v \in V$ that preserves the
linear operations, that is, $\alpha u_1 + \beta u_2 \leftrightarrow \alpha v_1 + \beta v_2$ whenever $u_1 \leftrightarrow v_1$ and $u_2 \leftrightarrow v_2$
for $u_1, u_2 \in U$ and $v_1, v_2 \in V$. Such a correspondence is called an isomorphism.
If one is only interested in linear operations on vectors, then two isomorphic
spaces are indistinguishable from each other.

Definition: A set of vectors is said to be linearly independent if a linear relation
of the form

$$\sum_{k=1}^{n} c_k |k\rangle = |0\rangle \tag{4.9}$$

implies that $c_k = 0$ for all $k$; otherwise the set of vectors is linearly dependent.

If a set of vectors is linearly dependent, then we can express at least one member
of the set as a linear combination of the other members of the set.

Examples: 

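1. Consider the set of vectors (3-tuples in this case)

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

This set is linearly independent since

$$a_1|1\rangle + a_2|2\rangle + a_3|3\rangle = \begin{pmatrix} a_1 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ a_2 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = |0\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

implies that the only solution is

$$a_1 = a_2 = a_3 = 0$$

2. Consider the set of vectors (3-tuples in this case)

$$|1\rangle = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

This set is linearly independent since

$$a_1|1\rangle + a_2|2\rangle + a_3|3\rangle = \begin{pmatrix} a_1 \\ a_1 \\ 0 \end{pmatrix} + \begin{pmatrix} a_2 \\ -a_2 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 \\ a_1 - a_2 \\ a_3 \end{pmatrix} = |0\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

implies that the only solution is

$$a_1 + a_2 = a_1 - a_2 = a_3 = 0 \quad \text{or} \quad a_1 = a_2 = a_3 = 0$$

3. Consider the set of vectors (3-tuples in this case)

$$|1\rangle = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}$$

This set is linearly dependent since

$$a_1|1\rangle + a_2|2\rangle + a_3|3\rangle = \begin{pmatrix} a_1 \\ a_1 \\ a_1 \end{pmatrix} + \begin{pmatrix} a_2 \\ -a_2 \\ 0 \end{pmatrix} + \begin{pmatrix} 2a_3 \\ 0 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 + 2a_3 \\ a_1 - a_2 \\ a_1 + a_3 \end{pmatrix} = |0\rangle$$

implies that the solution is

$$a_1 + a_2 + 2a_3 = a_1 - a_2 = a_1 + a_3 = 0 \quad \text{or} \quad a_2 = a_1,\; a_3 = -a_1$$

which is satisfied by nonzero coefficients, for example $|1\rangle + |2\rangle - |3\rangle = |0\rangle$.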


Note: For simplicity and in all cases where no ambiguity will arise, we
will simplify the notation for the "null" vector from now on. We will write

$$\sum_{k=1}^{n} c_k |k\rangle = 0 \quad \text{instead of} \quad \sum_{k=1}^{n} c_k |k\rangle = |0\rangle \tag{4.10}$$

We say that an infinite set of vectors is linearly independent if every finite
subset is linearly independent. Alternatively, for $n$ vectors in an $n$-dimensional
space we can use this method: if the determinant of the matrix formed by using
the vectors as columns is equal to zero, then the vectors are linearly dependent.
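As a quick check of the determinant test, the sketch below evaluates the $3 \times 3$ determinant for three sets of 3-tuples. It is only an illustration; the helper names `det3` and `linearly_independent` are assumptions introduced here, and the cofactor expansion is written out by hand to keep the example free of external libraries.

```python
def det3(v1, v2, v3):
    """Determinant of the 3x3 matrix whose columns are the 3-tuples v1, v2, v3."""
    m = [[v1[r], v2[r], v3[r]] for r in range(3)]  # m[row][column]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def linearly_independent(v1, v2, v3):
    """Three 3-tuples are independent exactly when the determinant is nonzero."""
    return det3(v1, v2, v3) != 0


set_1 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
set_2 = [(1, 1, 0), (1, -1, 0), (0, 0, 1)]
set_3 = [(1, 1, 1), (1, -1, 0), (2, 0, 1)]

for vectors in (set_1, set_2, set_3):
    print(vectors, det3(*vectors), linearly_independent(*vectors))
```

The first two sets give nonzero determinants (1 and -2), while the third gives zero, consistent with the relation $|1\rangle + |2\rangle - |3\rangle = 0$ among its members.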


Definition: The maximum number of linearly independent vectors in a space 
V is called the dimension of the space dim(V). 

Definition: A set of vectors $|k\rangle$, $k = 1, 2, 3, \ldots, n$ spans the space if every vector
$|Q\rangle$ in the space can be written as a linear combination of vectors in the set

$$|Q\rangle = \sum_{k=1}^{n} q_k |k\rangle \tag{4.11}$$

If the set is also linearly independent, this linear combination, which is given by
the coefficients $q_k$, $k = 1, 2, \ldots, n$, is unique.

Definition: A set of vectors is a basis for the space if it is a linearly independent
set and spans the space, that is, if $\dim(V) = m$, a set of $m$ linearly independent
vectors is called a basis on $V$.


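The set of vectors

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

is a maximal set of linearly independent vectors since any other vector $|g\rangle$ in
the space can always be written as a linear combination of them as

$$|g\rangle = a_1|1\rangle + a_2|2\rangle + a_3|3\rangle = \begin{pmatrix} a_1 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ a_2 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \tag{4.13}$$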


Therefore the dimension of this vector space is 3. This set of vectors is also a 
basis. The basis is not unique since the set of linearly independent vectors 




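$$|1\rangle = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{4.14}$$

also spans the space, i.e.,

$$|g\rangle = c_1|1\rangle + c_2|2\rangle + c_3|3\rangle = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} c_1 + c_2 \\ c_1 - c_2 \\ c_3 \end{pmatrix} \tag{4.15}$$

implies that

$$2c_1 = a_1 + a_2 \qquad 2c_2 = a_1 - a_2 \qquad c_3 = a_3 \tag{4.16}$$

and, thus, this set is also a basis. Clearly, a basis spans the whole of $V$.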
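The relations (4.16) can be tried out numerically. The sketch below, whose function names are hypothetical, computes the coefficients $c_1, c_2, c_3$ of a 3-tuple with respect to the basis (4.14) and then rebuilds the original 3-tuple from them.

```python
# Basis (4.14): |1> = (1, 1, 0), |2> = (1, -1, 0), |3> = (0, 0, 1)
BASIS = [(1, 1, 0), (1, -1, 0), (0, 0, 1)]


def components_in_basis(a1, a2, a3):
    """Solve (4.15) for the coefficients using (4.16)."""
    c1 = (a1 + a2) / 2
    c2 = (a1 - a2) / 2
    c3 = a3
    return c1, c2, c3


def rebuild(c1, c2, c3):
    """Form c1|1> + c2|2> + c3|3> componentwise."""
    return tuple(c1 * x + c2 * y + c3 * z for x, y, z in zip(*BASIS))


g = (1 + 2j, 3, -1j)             # an arbitrary complex 3-tuple
c = components_in_basis(*g)      # its components in the basis (4.14)
g_rebuilt = rebuild(*c)
print(c, g_rebuilt)
```

Because the coefficients exist and are unique for every choice of $(a_1, a_2, a_3)$, the set (4.14) is a basis, even though its vectors are not the standard unit 3-tuples.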


Definition: The coefficients in the expansion of an arbitrary vector $|Q\rangle$ in terms
of a basis set $|k\rangle$, $k = 1, 2, 3, \ldots, n$

$$|Q\rangle = \sum_{k=1}^{n} q_k |k\rangle \tag{4.17}$$

are called the components of the vector $|Q\rangle$ in that basis.


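Example: In the space of 3-tuples, a basis is represented by the three vectors

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

so that an arbitrary vector in the space can be written

$$|g\rangle = a_1|1\rangle + a_2|2\rangle + a_3|3\rangle = \begin{pmatrix} a_1 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ a_2 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$$

so that $a_1$, $a_2$ and $a_3$ are the components.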

When we add vectors (which must be defined with respect to the same basis) we simply
add their components

$$|Q\rangle + |R\rangle = \sum_{k=1}^{n} q_k |k\rangle + \sum_{k=1}^{n} r_k |k\rangle = \sum_{k=1}^{n} (q_k + r_k)\,|k\rangle \tag{4.20}$$

Subspaces - A subset of a vector space $V$, which is also a vector space, is a
subspace, that is, it contains all the linear combinations of any number of its
vectors; it is said to be closed. The smallest subspace that contains the set $S$ of
vectors is said to be spanned by $S$. We note that the intersection $M \cap N$ (in the
sense of set theory) of two subspaces $M$ and $N$ is a subspace, but, in general,
their union $M \cup N$ is not.


4.4. Inner Products 


The vector spaces we have been discussing do not need to contain vectors that 
have a well-defined length or direction in the ordinary sense (remember the 
example of the 2x2 matrices). We can, however, define generalizations of length 
and direction in an arbitrary space using a mathematical object called the inner 
product. The inner product is a generalization of the standard dot product. I 
will first discuss the inner product using standard mathematical notation and 
then return to the Dirac language. 

Definition: An inner product for a linear vector space associates a scalar $(f,g)$
with every ordered pair of vectors $f$, $g$. It satisfies these properties:

1. $(f,g)$ = complex number

2. $(f,g) = (g,f)^*$

3. $(f, c_1 g_1 + c_2 g_2) = c_1(f,g_1) + c_2(f,g_2)$

4. $(f,f) \geq 0$ with equality if and only if $f$ = null vector

Now (2) and (3) above imply that

$$(c_1 g_1 + c_2 g_2, f) = c_1^*(g_1,f) + c_2^*(g_2,f) \tag{4.21}$$

Hence, the inner product is said to be linear in its second argument and antilinear
in its first argument.

Definition: The non-negative number $\|f\| = (f,f)^{1/2}$ is called the norm or
length of the vector $f$. Clearly, $\|f\| = 0$ if and only if $f = 0$.

Definition: If the inner product of two vectors is zero, then the vectors are
orthogonal.

Definition: A set of vectors $\{f_i\}$ is said to be orthonormal if the vectors are
pairwise orthogonal and of unit norm. We can summarize this property by the
equation

$$(f_i, f_j) = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} \tag{4.22}$$

where the symbol $\delta_{ij}$ is called the Kronecker delta.

Schwarz's Inequality: scalar products satisfy

$$|(f,g)|^2 \leq (f,f)(g,g) \tag{4.23}$$

Triangle Inequality: norms satisfy

$$\|f + g\| \leq \|f\| + \|g\| \tag{4.24}$$

Equality holds in the Schwarz inequality only if one vector is a scalar multiple of
the other, i.e., $f = cg$; for equality in the triangle inequality the scalar $c$ must,
in addition, be real and nonnegative.

An inner product space is simply one in which an inner product is defined. A
unitary space is a finite-dimensional inner product vector space. In fact, every
finite-dimensional space can be made into a unitary space.

Examples:

1. For the case of $n$-tuples (an $n$-dimensional vector space)

$$f = (x_1, x_2, x_3, \ldots, x_n), \qquad g = (y_1, y_2, y_3, \ldots, y_n)$$

$$(f,g) = \sum_{k=1}^{n} x_k^* y_k$$

2. For the case of square integrable functions

$$(f,g) = \int f^*(x)\, g(x)\, w(x)\, dx$$

where $w(x)$ is a nonnegative weight function.

We used these weight functions in our earlier wave mechanics discussions (we
will see why later).
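The defining properties of the inner product and the two inequalities can be spot-checked for complex $n$-tuples. The following is a minimal sketch under the convention used above (antilinear in the first argument); the function names and the sample vectors are assumptions chosen only for illustration.

```python
import math


def inner(f, g):
    """Inner product (f, g) = sum_k conj(x_k) * y_k for n-tuples f and g."""
    return sum(x.conjugate() * y for x, y in zip(f, g))


def norm(f):
    """Norm ||f|| = (f, f)^(1/2); (f, f) is real and non-negative."""
    return math.sqrt(inner(f, f).real)


f = (1 + 1j, 2, 0)
g = (1j, 1, 3 - 1j)
f_plus_g = tuple(x + y for x, y in zip(f, g))

print(inner(f, g), inner(g, f))                          # complex conjugates of each other
print(abs(inner(f, g)) ** 2, norm(f) ** 2 * norm(g) ** 2)  # Schwarz: left <= right
print(norm(f_plus_g), norm(f) + norm(g))                  # triangle: left <= right
```

Swapping the arguments conjugates the result, which is property 2 of the definition; the two printed comparisons illustrate (4.23) and (4.24) for this particular pair.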

When a vector $Q$ is expanded as a linear combination

$$Q = \sum_k q_k f_k \tag{4.25}$$

of orthonormal basis vectors $f_k$, the coefficients or components are given by the
inner product, i.e.,

$$(f_j, Q) = \sum_k q_k (f_j, f_k) = \sum_k q_k \delta_{jk} = q_j \tag{4.26}$$

or

$$Q = \sum_k (f_k, Q)\, f_k \tag{4.27}$$

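Of the two basis vector sets for the 3-dimensional vectors we looked at earlier:

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{4.28}$$

is an orthonormal basis and

$$|1\rangle = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \qquad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{4.29}$$

is orthogonal but not orthonormal.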
For later use we note the following. 


Given two unitary spaces $U$ and $V$, their direct sum $W = U \oplus V$ is the unitary
space consisting of the pairs $\{f, g\}$ written $f \oplus g$, $f \in U$, $g \in V$ with the operations
defined by

$$\alpha_1\, f_1 \oplus g_1 + \alpha_2\, f_2 \oplus g_2 = (\alpha_1 f_1 + \alpha_2 f_2) \oplus (\alpha_1 g_1 + \alpha_2 g_2) \tag{4.30}$$

$$(f_1 \oplus g_1, f_2 \oplus g_2) = (f_1, f_2) + (g_1, g_2) \tag{4.31}$$

This is a definition that can clearly be extended to any finite number of spaces. 

4.5. Linear Functionals 

Linear complex-valued functions of vectors are called linear functionals on $V$.
A linear functional $\Gamma$ assigns a scalar value $\Gamma(f)$ to each vector $f$ in the vector
space $V$, such that linearity holds

$$\Gamma(c_1 f_1 + c_2 f_2) = c_1 \Gamma(f_1) + c_2 \Gamma(f_2) \tag{4.32}$$

for any vectors $f_1$ and $f_2$ and any scalars $c_1$ and $c_2$.

The set of linear functionals on $V$ form a linear space $V'$ where we define addition
and scalar multiplication by

$$(\Gamma_1 + \Gamma_2)(f) = \Gamma_1(f) + \Gamma_2(f) \tag{4.33}$$

$$(c\Gamma)(f) = c\,\Gamma(f) \tag{4.34}$$

The vector space $V'$ is called the dual space of the vector space $V$.


There is a one-to-one correspondence between linear functionals $\Gamma$ in $V'$ and the
vectors $f$ in $V$, such that the linear functional associated with $f$ has the form

$$\Gamma_f(g) = (f, g) \tag{4.35}$$

where $f$ is a fixed vector and $g$ is an arbitrary vector. This is called the Riesz
theorem.

Using the properties of the inner product we can show that

$$\Gamma_f + \Gamma_g = \Gamma_{f+g} \quad \text{and} \quad a\Gamma_f = \Gamma_{a^* f} \tag{4.36}$$

or

$$\Gamma_f(h) + \Gamma_g(h) = \Gamma_{(f+g)}(h) \tag{4.37}$$

$$(f,h) + (g,h) = (f + g, h) \tag{4.38}$$

and

$$a\Gamma_f(h) = a(f,h) = (a^* f, h) = \Gamma_{a^* f}(h) \tag{4.39}$$

The scalar product is clearly antilinear in $\Gamma$ and linear in $f$.

The vector $f$ that corresponds to a given linear functional $\Gamma_f$ is easily found by
direct construction.

Let $\{\alpha_n\}$ be a set of orthonormal basis vectors in $V$ (this means that $(\alpha_i, \alpha_j) = \delta_{ij}$),
and let

$$\phi = \sum_n c_n \alpha_n \tag{4.40}$$

be an arbitrary vector in $V$. Then we have from its definition

$$\Gamma_f(\phi) = \Gamma_f\left(\sum_n c_n \alpha_n\right) = \sum_n c_n \Gamma_f(\alpha_n) \tag{4.41}$$

We now choose

$$f = \sum_n [\Gamma_f(\alpha_n)]^* \alpha_n \tag{4.42}$$

Its inner product with the arbitrary vector $\phi$ is

$$(f, \phi) = \left(\sum_n [\Gamma_f(\alpha_n)]^* \alpha_n, \sum_m c_m \alpha_m\right) = \sum_{n,m} [\Gamma_f(\alpha_n)]\, c_m (\alpha_n, \alpha_m) \tag{4.43}$$

$$= \sum_{n,m} [\Gamma_f(\alpha_n)]\, c_m \delta_{nm} = \sum_n [\Gamma_f(\alpha_n)]\, c_n = \Gamma_f(\phi) \tag{4.44}$$

So the vector $f$ corresponding to $\Gamma_f$ is given by

$$f = \sum_n [\Gamma_f(\alpha_n)]^* \alpha_n \tag{4.45}$$
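The construction (4.45) can be carried out explicitly for a small example. The sketch below assumes complex 3-tuples with the standard orthonormal basis and an arbitrarily chosen linear functional `gamma`; all names are illustrative. It builds $f$ from the values of the functional on the basis and compares $(f, \phi)$ with $\Gamma_f(\phi)$.

```python
def inner(f, g):
    """Inner product, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(f, g))


def riesz_vector(functional, basis):
    """Build f = sum_n [Gamma(alpha_n)]^* alpha_n for an orthonormal basis."""
    dim = len(basis[0])
    f = [0j] * dim
    for alpha in basis:
        coeff = complex(functional(alpha)).conjugate()
        for i in range(dim):
            f[i] += coeff * alpha[i]
    return tuple(f)


def gamma(phi):
    """A sample linear functional on 3-tuples (an arbitrary choice)."""
    return 2 * phi[0] + (1 - 3j) * phi[1] - 1j * phi[2]


basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
f = riesz_vector(gamma, basis)
phi = (1j, 2, 1 - 1j)

print(f)                            # the vector representing gamma
print(inner(f, phi), gamma(phi))    # the two numbers agree
```

Note the complex conjugation of the coefficients: it is needed precisely because the inner product is antilinear in its first argument.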

4.6. The Dirac language of Kets and Bras 

Let us rewrite everything we have done so far in the Dirac language. 

Vectors in the linear space $V$ are called kets or ket vectors and denoted by $|a\rangle$.
Linear functionals $\Gamma_b$ in the dual linear space $V'$ are called bras or bra vectors
and are denoted by $\langle b|$.

There is a one-to-one correspondence between kets in $V$ and bras in $V'$ or
between vectors and linear functionals. We will use the same label to denote
these corresponding quantities

$$|a\rangle \leftrightarrow \langle a| \tag{4.46}$$

The inner product between two vectors $|a\rangle$ and $|b\rangle$ in $V$ corresponds to the
linear functional of the left-hand vector assigning a scalar value to the
right-hand vector

$$\Gamma_a(b) = (a, b) = \langle a | b \rangle = \text{complex number} \tag{4.47}$$

The orthonormality property of a basis set $\{\alpha_i\}$ is expressed by the linear
functional relation

$$\langle \alpha_i | \alpha_j \rangle = \delta_{ij} \tag{4.48}$$

The expansion of an arbitrary state in an orthonormal basis is given by

$$|q\rangle = \sum_n a_n |\alpha_n\rangle \tag{4.49}$$

and the expansion coefficients are

$$\langle \alpha_m | q \rangle = \sum_n a_n \langle \alpha_m | \alpha_n \rangle = \sum_n a_n \delta_{mn} = a_m \tag{4.50}$$

or $a_m$ is the linear functional $\Gamma_{\alpha_m}(q)$ of the state $|q\rangle$ with respect to the
corresponding basis vector $|\alpha_m\rangle$.

If we have two vectors $|a\rangle$ and $|b\rangle$ and we expand them in an orthonormal basis
set $\{\alpha_i\}$ as

$$|a\rangle = \sum_n a_n |\alpha_n\rangle \quad \text{and} \quad |b\rangle = \sum_n b_n |\alpha_n\rangle \tag{4.51}$$

then

$$\langle a | b \rangle = \sum_n a_n^* b_n = \langle b | a \rangle^* = \left(\sum_n b_n^* a_n\right)^* \tag{4.52}$$

where we have used the antilinear property of inner products that says that we
have the correspondence

$$c\,|a\rangle \leftrightarrow c^* \langle a| \tag{4.53}$$

The linear functional itself is directly represented by the bra vector, i.e.,

$$\langle q|\,\cdot = \Gamma_q(\cdot) = (q, \cdot) \tag{4.54}$$

In Dirac language, there is nothing to distinguish between the value of the linear
functional $\langle q|$ for the vector $|p\rangle$ and the inner product of the vectors $|q\rangle$ and $|p\rangle$.
They are both $\langle q | p \rangle$.


4.7. Gram-Schmidt Orthogonalization Process 

An orthonormal basis set for an $n$-dimensional vector space can always be
constructed from any set of $n$ linearly independent vectors using the Gram-Schmidt
orthogonalization method.

Suppose that we have a set of $n$ linearly independent vectors $|\alpha_i\rangle$, $i = 1, 2, \ldots, n$
that are not a mutually orthonormal set. We can construct a mutually orthonormal
set $|\beta_i\rangle$, $i = 1, 2, \ldots, n$ using the following steps:

1. let

$$|\beta_1\rangle = |\alpha_1\rangle$$

2. let

$$|\beta_2\rangle = |\alpha_2\rangle + a_1 |\beta_1\rangle$$

where we choose $a_1$ such that $\langle \beta_1 | \beta_2 \rangle = 0$

3. this gives

$$\langle \beta_1 | \beta_2 \rangle = 0 = \langle \beta_1 | \alpha_2 \rangle + a_1 \langle \beta_1 | \beta_1 \rangle$$

$$a_1 = -\frac{\langle \beta_1 | \alpha_2 \rangle}{\langle \beta_1 | \beta_1 \rangle}$$

Now proceed by induction.

Suppose we have constructed $k$ mutually orthogonal vectors $|\beta_i\rangle$, $i = 1, 2, \ldots, k$.
If we let

$$|\beta_{k+1}\rangle = |\alpha_{k+1}\rangle + \sum_{j=1}^{k} a_j |\beta_j\rangle \tag{4.55}$$

with

$$a_j = -\frac{\langle \beta_j | \alpha_{k+1} \rangle}{\langle \beta_j | \beta_j \rangle} \tag{4.56}$$

then we have $\langle \beta_j | \beta_{k+1} \rangle = 0$ for $j = 1, 2, \ldots, k$. These steps are repeated until we
have $n$ mutually orthogonal vectors. We then normalize them to 1 and create
an orthonormal set.
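The steps above can be sketched in a few lines of code. This is a minimal illustration of the procedure defined by (4.55) and (4.56) for $n$-tuples, not a numerically robust implementation; the function names and the sample vectors are assumptions made for the example.

```python
import math


def inner(a, b):
    """Inner product <a|b>, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def gram_schmidt(alphas):
    """Return an orthonormal set built from linearly independent n-tuples."""
    betas = []
    for alpha in alphas:
        beta = list(alpha)
        for b in betas:
            # a_j = -<beta_j|alpha_{k+1}> / <beta_j|beta_j>, as in (4.56)
            a_j = -inner(b, alpha) / inner(b, b)
            beta = [x + a_j * y for x, y in zip(beta, b)]
        betas.append(beta)
    # Normalize each orthogonal vector to unit norm.
    gammas = []
    for b in betas:
        length = math.sqrt(inner(b, b).real)
        gammas.append([x / length for x in b])
    return gammas


gammas = gram_schmidt([(1, 1, 0), (0, 1, 1), (1, 0, 1)])
for gamma_vec in gammas:
    print(gamma_vec)
```

For these sample vectors the three outputs are proportional to $(1, 1, 0)$, $(-1, 1, 2)$ and $(1, -1, 1)$, with norms scaled to one.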


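For example, suppose we have the set

$$|\alpha_1\rangle = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \qquad |\alpha_2\rangle = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \qquad |\alpha_3\rangle = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$$

These vectors are not orthonormal.

1. let

$$|\beta_1\rangle = |\alpha_1\rangle = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \qquad \langle \beta_1 | \beta_1 \rangle = 2 \tag{4.57}$$

2. let

$$|\beta_2\rangle = |\alpha_2\rangle + a_1 |\beta_1\rangle$$

with

$$a_1 = -\frac{\langle \beta_1 | \alpha_2 \rangle}{\langle \beta_1 | \beta_1 \rangle} = -\frac{1}{2}$$

and thus

$$|\beta_2\rangle = |\alpha_2\rangle - \frac{1}{2} |\alpha_1\rangle = \frac{1}{2} \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} \qquad \langle \beta_2 | \beta_2 \rangle = \frac{3}{2} \qquad \langle \beta_1 | \beta_2 \rangle = 0$$

3. let

$$|\beta_3\rangle = |\alpha_3\rangle + a_1 |\beta_1\rangle + a_2 |\beta_2\rangle$$

with

$$a_1 = -\frac{\langle \beta_1 | \alpha_3 \rangle}{\langle \beta_1 | \beta_1 \rangle} = -\frac{1}{2} \qquad a_2 = -\frac{\langle \beta_2 | \alpha_3 \rangle}{\langle \beta_2 | \beta_2 \rangle} = -\frac{1}{3}$$

and thus

$$|\beta_3\rangle = \frac{2}{3} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} \qquad \langle \beta_3 | \beta_3 \rangle = \frac{4}{3} \qquad \langle \beta_1 | \beta_3 \rangle = 0 \qquad \langle \beta_2 | \beta_3 \rangle = 0$$

We normalize the vectors by dividing by their respective norms,

$$|\gamma_i\rangle = \frac{|\beta_i\rangle}{\|\,|\beta_i\rangle\,\|} = \frac{|\beta_i\rangle}{\langle \beta_i | \beta_i \rangle^{1/2}}$$

The orthonormal set is then

$$|\gamma_1\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \qquad |\gamma_2\rangle = \frac{1}{\sqrt{6}} \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} \qquad |\gamma_3\rangle = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$$

4.8. Linear Operators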


In general, an operator defined on a vector space maps vectors into vectors,
$\hat{Q}: V \to V$, i.e., if $\hat{Q}$ = operator and $|k\rangle$ = a vector, then $\hat{Q}|k\rangle = |q\rangle$ = a vector (we
will use the ^ symbol on top to signify operators). We say that $|q\rangle$ is the image
of $|k\rangle$ under the mapping $\hat{Q}$. An operator is fully defined if we specify its action
on all vectors in its domain, which is the set of all vectors $\{|k\rangle\}$ for which $\hat{Q}|k\rangle$
is defined. Its range is the set of all vectors of the form $\hat{Q}|k\rangle$ for all $|k\rangle$.

In the special case of a linear operator, the operator also satisfies the linearity
relation

$$\hat{Q}(a_1|k_1\rangle + a_2|k_2\rangle) = \hat{Q}(a_1|k_1\rangle) + \hat{Q}(a_2|k_2\rangle) = a_1\hat{Q}|k_1\rangle + a_2\hat{Q}|k_2\rangle \tag{4.58}$$

In general, the operator $\hat{A}$ is linear if, for all $|f\rangle, |g\rangle \in V$, $\hat{A}(|f\rangle + |g\rangle) = \hat{A}|f\rangle + \hat{A}|g\rangle$,
$(a\hat{A})|f\rangle = a(\hat{A}|f\rangle)$ and $\hat{A} = 0$ if and only if $\hat{A}|f\rangle = 0$ for all $|f\rangle$. The norm $\|\hat{A}\|$
of the linear operator $\hat{A}$ is defined as the maximum of $\|\hat{A}|f\rangle\|$ for all $|f\rangle$ such
that $\|\,|f\rangle\,\| \leq 1$. The identity operator $\hat{I}$ is defined by $\hat{I}|f\rangle = |f\rangle$ for all $|f\rangle$.

Since any vector in the space can be written as a linear combination of a basis 
set of vectors, we need only define a linear operator on the basis set and then 
its operation on all vectors is known. 

Quantum mechanics works exclusively with linear operators so we will just call 
them operators from now on. 

When are two operators equal? 

In general, we say that two operators are equal, i.e., $\hat{Q}_1 = \hat{Q}_2$, when $\hat{Q}_1|p\rangle = \hat{Q}_2|p\rangle$
for all vectors $|p\rangle$ in the intersection of the domains of $\hat{Q}_1$ and $\hat{Q}_2$.
Using this equality idea we can define the sum and product of operators by

$$(\hat{Q}_1 + \hat{Q}_2)|k\rangle = \hat{Q}_1|k\rangle + \hat{Q}_2|k\rangle \tag{4.59}$$

$$(\hat{Q}_1\hat{Q}_2)|k\rangle = \hat{Q}_1(\hat{Q}_2|k\rangle) \tag{4.60}$$

for all vectors $|k\rangle$. It is clear that these relations imply that

$$(\hat{Q}_1\hat{Q}_2\hat{Q}_3)|k\rangle = (\hat{Q}_1\hat{Q}_2)\hat{Q}_3|k\rangle = \hat{Q}_1(\hat{Q}_2\hat{Q}_3)|k\rangle \tag{4.61}$$

or

$$(\hat{Q}_1\hat{Q}_2)\hat{Q}_3 = \hat{Q}_1(\hat{Q}_2\hat{Q}_3) \tag{4.62}$$

which corresponds to associativity. Now

$$(\hat{Q}_1\hat{Q}_2)|k\rangle = \hat{Q}_1(\hat{Q}_2|k\rangle) = \hat{Q}_1|k_2\rangle = |k_{12}\rangle \tag{4.63}$$

$$(\hat{Q}_2\hat{Q}_1)|k\rangle = \hat{Q}_2(\hat{Q}_1|k\rangle) = \hat{Q}_2|k_1\rangle = |k_{21}\rangle \tag{4.64}$$

does not imply that $|k_{12}\rangle = |k_{21}\rangle$. Therefore, in general we have $\hat{Q}_1\hat{Q}_2 \neq \hat{Q}_2\hat{Q}_1$,
which corresponds to the noncommutativity of the two operators.

We define the commutator of two operators by

$$[\hat{Q}_1, \hat{Q}_2] = \hat{Q}_1\hat{Q}_2 - \hat{Q}_2\hat{Q}_1 \tag{4.65}$$

Two operators are said to commute if their corresponding commutator is zero
(the null operator).

The commutator will be a fundamental quantity in our discussions of quantum
mechanics.
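A small matrix example makes the noncommutativity concrete. The sketch below represents two operators by $2 \times 2$ matrices chosen purely for illustration and evaluates the commutator (4.65); the helper names are assumptions.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def commutator(a, b):
    """[A, B] = AB - BA for square matrices of equal size."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))]
            for i in range(len(a))]


Q1 = [[0, 1], [1, 0]]
Q2 = [[1, 0], [0, -1]]
IDENTITY = [[1, 0], [0, 1]]

print(matmul(Q1, Q2))          # Q1 Q2
print(matmul(Q2, Q1))          # Q2 Q1, a different matrix
print(commutator(Q1, Q2))      # nonzero: Q1 and Q2 do not commute
print(commutator(Q1, IDENTITY))  # zero: every operator commutes with the identity
```

Here the two products differ by a sign, so the commutator is nonzero, while the commutator with the identity vanishes.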


4.9. An Aside: The Connection to Matrices 

Now suppose that we have an TV-dimensional vector space and we choose an 
orthonormal basis set 

{\qi) >* = 1)2, • ,TV) (4.66) 

In an TV-dimensional space, we can always represent any vector as an TV-element 
column of numbers or as a column vector. 


If we expand some arbitrary state |a) in terms of this basis set and operate on 
it with an operator Q, then we have 

n n 

|/3) = Q\a) = QY, c i Wi) = ( 4 -67) 

i =1 i =1 

where \/3) is another vector in the same space since Q is a linear operator. 


Now expanding the vector \/3) in the same basis set as 

N 

l/?) = I>l®> (4.68) 

i= 1 

and constructing the linear functional (inner product) of \(3) = Q |a) with respect 
to | qi) we get 

N N N 

i<lk\Q\a) = (q k \QY c ih) = E c * (<lk\Q\qi) = {qk 1/3} = Y, d 3 ( Qk\qj ) (4.69) 

i =1 i =1 j =1 

Using the orthonormality property of the basis set 

(Qk\Qj) = Skj (4.70) 


we get 


N 

{Qk \ Q\Qi) = djSkj = dk 

3 =1 


(4.71) 
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(4.72) 


If we define the numbers Qki = {qk\ Q Vh) we get the relation 

N 

QkiPi — dk 

0 = 1 

Clearly, this has the form of a matrix equation where Qki = {qk\ Q Wi) is defined 
as the (hi)- matrix element of the operator Q. In this case, we would have the 
representation 

( Qij ) = N x N matrix (4.73) 

(Ci ) = N x 1 column matrix (4.74) 

( di ) = N x 1 column matrix (4-75) 

and the matrix equation (using matrix multiplication) represented above is Qc = 

d. 

All finite dimensional vector space operator equations can be turned into matrix 
equations in this manner. 
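
The construction above can be checked numerically. The following is a minimal sketch, assuming a 2-dimensional space and an arbitrarily chosen operator and basis (the names `inner`, `apply_op`, `Q_std` and the numerical values are illustrative and not taken from the text). It builds the matrix elements $Q_{ki} = \langle q_k|Q|q_i\rangle$ and confirms that the coefficients satisfy $Qc = d$.

```python
# Sketch only: operator, basis and vector are arbitrary illustrative choices.

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply_op(Q, v):
    """Components of Q|v> for a matrix Q and a column of components v."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

# Orthonormal basis |q_1>, |q_2> (deliberately not the standard basis).
s = 2 ** -0.5
basis = [[s, s], [s, -s]]

# An arbitrary operator, specified by its action on standard components.
Q_std = [[1, 2j], [3, 4]]

# Matrix elements Q_ki = <q_k|Q|q_i> in the chosen basis.
N = len(basis)
Q = [[inner(basis[k], apply_op(Q_std, basis[i])) for i in range(N)]
     for k in range(N)]

# Coefficients c_i of |alpha> and d_k of |beta> = Q|alpha>.
alpha = [1 + 1j, 2]
c = [inner(q, alpha) for q in basis]
d = [inner(q, apply_op(Q_std, alpha)) for q in basis]

# Left-hand side of the matrix equation Qc = d.
Qc = apply_op(Q, c)
```

Comparing `Qc` with `d` component by component should give agreement up to rounding error, whatever orthonormal basis is used.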


4.10. More about Vectors, Linear Functionals, Operators

We have discussed what happens when an operator acts to the right on a ket 
vector. What about acting to the left on a bra vector, i.e., what is the meaning 
of the quantity 

$$\langle q|Q \tag{4.76}$$

Since the bra vector is really a linear functional we must be careful and look 
very closely at the mathematical content of the above quantity. Remember that 
the definition of the linear functional was 

$$\langle q|p\rangle = \Gamma_q(p) = (q, p) \tag{4.77}$$


The standard definition (in terms of inner products) of the adjoint operator $Q^\dagger$
of the operator $Q$ in a linear vector space is

$$(Q^\dagger q, p) = (q, Qp) \tag{4.78}$$

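The adjoint obeys the following properties:

$$(\alpha A)^\dagger = \alpha^* A^\dagger \qquad (A + B)^\dagger = A^\dagger + B^\dagger \tag{4.79}$$

$$(AB)^\dagger = B^\dagger A^\dagger \qquad (A^{-1})^\dagger = (A^\dagger)^{-1} \tag{4.80}$$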


If we define a new vector $\phi$ by the operation $Q^\dagger q = \phi$, then using the definition
of the adjoint operator we have

$$\Gamma_\phi(p) = (\phi, p) = (Q^\dagger q, p) = (q, Qp) = \Gamma_q(Qp) \tag{4.81}$$
We want the action of an operator on the bra space of linear functionals or bra
vectors to create a new linear functional or bra vector in the same way that
it does for ket vectors. We can guarantee this by defining the operation of an
operator on the bra space of linear functionals by

$$Q\Gamma_q(p) = \Gamma_q(Qp) \tag{4.82}$$

Using this new definition and the definition of a linear functional, we then have

$$Q\Gamma_q(c_1p_1 + c_2p_2) = \Gamma_q(Q(c_1p_1 + c_2p_2)) \tag{4.83}$$

$$= c_1\Gamma_q(Qp_1) + c_2\Gamma_q(Qp_2) \tag{4.84}$$

$$= c_1Q\Gamma_q(p_1) + c_2Q\Gamma_q(p_2) \tag{4.85}$$

which says that $Q\Gamma_q(\ldots)$ itself is also a linear functional.

Thus, our definition of how an operator acts on the bra space of linear functionals 
simply says that it creates a new functional as we desired. 

With this definition, we then have

$$Q\Gamma_q(p) = \Gamma_\phi(p) = (\phi, p) = (Q^\dagger q, p) = (q, Qp) = \Gamma_q(Qp) \tag{4.86}$$

or, since $(Q^\dagger q, p) = \Gamma_{Q^\dagger q}(p)$, the relationship among linear functionals is

$$Q\Gamma_q(\ldots) = \Gamma_{Q^\dagger q}(\ldots) \tag{4.87}$$

In terms of bra and ket vectors, this says that if $\langle q|$ and $|q\rangle$ are corresponding
bra and ket vectors (remember, the Riesz theorem says there is a unique bra
vector for each ket vector), then

$$Q|q\rangle = |\beta\rangle \quad \text{and} \quad \langle q|Q^\dagger = \langle\beta| \tag{4.88}$$

should also be corresponding bra and ket vectors.

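Since $\langle\beta|p\rangle^* = \langle p|\beta\rangle$ we then have that

$$\langle q|Q^\dagger|p\rangle^* = \langle p|Q|q\rangle \tag{4.89}$$

for all $p$ and $q$. This relation is equivalent to the original inner product relation

$$(Q^\dagger q, p) = (q, Qp) \tag{4.90}$$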

The end result of this discussion is that $\langle q|Q^\dagger = \langle\beta|$ is the bra vector (linear
functional) corresponding to the ket vector $Q|q\rangle = |\beta\rangle$. Since the adjoint operator
satisfies the relations

$$(cQ)^\dagger = c^*Q^\dagger \qquad (QR)^\dagger = R^\dagger Q^\dagger \qquad (Q + R)^\dagger = Q^\dagger + R^\dagger \tag{4.91}$$
we can define another product among the bra and ket vectors, which is an
operator rather than a scalar as in the case of the inner product. It is called the
outer product and, in the Dirac language, has the form

$$|q\rangle\langle p| \tag{4.92}$$

It is clear that the outer product is an operator since its action on any other
vector always results in a vector

$$(|q\rangle\langle p|)|s\rangle = |q\rangle\langle p|s\rangle \tag{4.93}$$

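We also have that

$$(\langle q|(|q\rangle\langle p|)^\dagger|p\rangle)^* = \langle p|(|q\rangle\langle p|)|q\rangle = \langle p|q\rangle\langle p|q\rangle = (\langle q|p\rangle\langle q|p\rangle)^* \tag{4.94}$$

or

$$\langle q|(|q\rangle\langle p|)^\dagger|p\rangle = \langle q|p\rangle\langle q|p\rangle = \langle q|(|p\rangle\langle q|)|p\rangle \tag{4.95}$$

which implies that

$$(|q\rangle\langle p|)^\dagger = |p\rangle\langle q| \tag{4.96}$$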

This type of operator will be very important in our later discussions. The special
case

$$|p\rangle\langle p| \tag{4.97}$$

(with $|p\rangle$ normalized) is called a projection operator because it picks out the
component of an arbitrary state vector in the "direction" of the vector $|p\rangle$.

Projection operators (and linear combinations of them) will be the mathematical
objects with a direct connection to physical states and measurements in our later
discussions.


Example in a Finite Dimensional Vector Space 


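Let us consider the 2-dimensional vector space spanned by the orthonormal
basis set

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad |2\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{4.98}$$

We can define two projection operators as

$$P_1 = |1\rangle\langle 1| \qquad P_2 = |2\rangle\langle 2| \tag{4.99}$$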


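The matrix representation of these two projection operators is easily found using
$\langle 1|1\rangle = \langle 2|2\rangle = 1$ and $\langle 1|2\rangle = \langle 2|1\rangle = 0$ and $Q_{ki} = \langle k|Q|i\rangle$. We have

$$(P_1) = \begin{pmatrix} \langle 1|P_1|1\rangle & \langle 1|P_1|2\rangle \\ \langle 2|P_1|1\rangle & \langle 2|P_1|2\rangle \end{pmatrix} = \begin{pmatrix} \langle 1|1\rangle\langle 1|1\rangle & \langle 1|1\rangle\langle 1|2\rangle \\ \langle 2|1\rangle\langle 1|1\rangle & \langle 2|1\rangle\langle 1|2\rangle \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{4.100}$$

$$(P_2) = \begin{pmatrix} \langle 1|P_2|1\rangle & \langle 1|P_2|2\rangle \\ \langle 2|P_2|1\rangle & \langle 2|P_2|2\rangle \end{pmatrix} = \begin{pmatrix} \langle 1|2\rangle\langle 2|1\rangle & \langle 1|2\rangle\langle 2|2\rangle \\ \langle 2|2\rangle\langle 2|1\rangle & \langle 2|2\rangle\langle 2|2\rangle \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \tag{4.101}$$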

Now consider an arbitrary vector in this space

$$|a\rangle = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = a_1|1\rangle + a_2|2\rangle \tag{4.102}$$

We then have (using both Dirac and matrix languages)

$$P_1|a\rangle = a_1|1\rangle\langle 1|1\rangle + a_2|1\rangle\langle 1|2\rangle = a_1|1\rangle \tag{4.103}$$

and the projection operator performs as advertised.


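We note that (at least in this special case)

$$(P_1) + (P_2) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I = \text{identity operator} \tag{4.104}$$

or

$$(P_1 + P_2)|a\rangle = (|1\rangle\langle 1| + |2\rangle\langle 2|)|a\rangle \tag{4.105}$$

$$= \sum_{j=1}^{2}|j\rangle\langle j|a\rangle = |a\rangle = I|a\rangle \tag{4.106}$$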


where we have made use of the expansion formula for an arbitrary state in an 
orthonormal basis. 
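
The 2-dimensional example can be reproduced in a few lines. This is a minimal sketch using plain Python lists as matrices (the helper names `outer`, `matmul`, `add` and the test vector `a` are illustrative assumptions). It builds $P_1$ and $P_2$ as outer products and evaluates the combinations discussed above.

```python
# Sketch only: helper names and the test vector are illustrative.

def outer(u, v):
    """Matrix of the outer product |u><v|."""
    return [[x * complex(y).conjugate() for y in v] for x in u]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    """Elementwise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

ket1, ket2 = [1, 0], [0, 1]

P1 = outer(ket1, ket1)        # matrix of |1><1|
P2 = outer(ket2, ket2)        # matrix of |2><2|

completeness = add(P1, P2)    # P1 + P2
P1_squared = matmul(P1, P1)   # P1 P1
P1_P2 = matmul(P1, P2)        # P1 P2

# Action of P1 on an arbitrary vector |a> = a1|1> + a2|2>.
a = [3 - 1j, 2j]
P1_a = [sum(P1[i][j] * a[j] for j in range(2)) for i in range(2)]
```

Here `completeness` is the identity matrix, `P1_squared` equals `P1`, `P1_P2` vanishes, and `P1_a` keeps only the first component of `a`.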

Return to Gram-Schmidt 


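As before, suppose we have a set of linearly independent, but non-orthogonal
vectors $|i\rangle$ in an $n$-dimensional linear vector space. We can construct a set of
orthogonal vectors $|\alpha_i\rangle$ as follows (each $|\alpha_i\rangle$ is taken to be normalized before it
is used to build a projection operator):

$$|\alpha_1\rangle = |1\rangle \qquad |\alpha_2\rangle = |2\rangle - |\alpha_1\rangle\langle\alpha_1|2\rangle \tag{4.107}$$

where

$$|\alpha_1\rangle\langle\alpha_1| = P_{\alpha_1} = \text{projection operator} \tag{4.108}$$

Then

$$|\alpha_2\rangle = |2\rangle - P_{\alpha_1}|2\rangle \tag{4.109}$$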




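As earlier, the fact that this type of operator is a projection operator is shown
clearly by considering its effect on an arbitrary vector

$$|Q\rangle = q_1|1\rangle + q_2|2\rangle + q_3|3\rangle + \cdots = \sum_i q_i|i\rangle \tag{4.110}$$

Then using

$$P_{\alpha_n} = |\alpha_n\rangle\langle\alpha_n| \tag{4.111}$$

we get

$$P_{\alpha_n}|Q\rangle = \sum_i q_iP_{\alpha_n}|i\rangle = \sum_i q_i|\alpha_n\rangle\langle\alpha_n|i\rangle = \sum_i q_i|\alpha_n\rangle\delta_{ni} = q_n|\alpha_n\rangle \tag{4.112}$$

or

$$\sum_n|\alpha_n\rangle\langle\alpha_n| = I = \text{identity operator} \tag{4.113}$$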

which makes perfect sense since if you project out all components of a vector 
you just get the vector back! This generalizes the earlier result we found in the 
2-dimensional space. This Dirac language is very interesting! 

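We can then write a general Gram-Schmidt procedure as

$$|\alpha_1\rangle = |1\rangle \qquad |\alpha_2\rangle = (1 - P_{\alpha_1})|2\rangle \qquad |\alpha_3\rangle = (1 - P_{\alpha_1} - P_{\alpha_2})|3\rangle \qquad \ldots$$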



which is rather neat. This shows the power of these projection operators and of 
the Dirac language. 
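
The projection-operator form of the Gram-Schmidt procedure translates directly into code. The sketch below is one possible implementation, assuming that each new vector is normalized before its projector is used (the function names and the three input vectors are illustrative, not from the text).

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def project(alpha, v):
    """P_alpha|v> = |alpha><alpha|v> for a normalized |alpha>."""
    coeff = inner(alpha, v)
    return [coeff * a for a in alpha]

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors using
    |alpha_n> proportional to (1 - P_1 - ... - P_{n-1})|n>."""
    ortho = []
    for v in vectors:
        w = [complex(x) for x in v]
        for alpha in ortho:
            p = project(alpha, v)                  # P_alpha |n>
            w = [wi - pi for wi, pi in zip(w, p)]  # subtract the projection
        norm = math.sqrt(inner(w, w).real)
        ortho.append([wi / norm for wi in w])      # normalize before reuse
    return ortho

# Three linearly independent, non-orthogonal vectors (illustrative values).
vecs = [[1, 1, 0], [1, 0, 1], [0, 1, 1j]]
alphas = gram_schmidt(vecs)
```

The resulting `alphas` should satisfy $\langle\alpha_i|\alpha_j\rangle = \delta_{ij}$ up to rounding error.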

Important Point - Looking at the equations

$$Q|q\rangle = |\beta\rangle \quad \text{and} \quad \langle q|Q^\dagger = \langle\beta| \tag{4.114}$$

we might be tempted (as is the case in many textbooks) at this point to write

$$(|q\rangle)^\dagger = \langle q| \tag{4.115}$$

This might seem to make sense in a finite dimensional vector space where we 
can always treat kets as column vectors and bras as row vectors. However, the 
adjoint symbol is really only defined for operators and not for vectors, so one 
should exercise great care before generalizing this result, especially in infinite 
dimensional spaces! 

Finally, we define a property of an operator called the trace as

$$\mathrm{Tr}\,Q = \sum_j Q_{jj} = \text{sum of diagonal elements} = \sum_j\langle q_j|Q|q_j\rangle \tag{4.116}$$
4.11. Self-Adjoint Operators

If we have an operator $Q$ whose adjoint satisfies $Q^\dagger = Q$, so that

$$\langle q|Q|p\rangle = (\langle p|Q|q\rangle)^* \tag{4.117}$$

then $Q$ is a Hermitian or self-adjoint operator.

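In a finite-dimensional space, we can look at the matrix elements corresponding
to a Hermitian operator. Let

$$|p\rangle = \sum_{i=1}^{N} a_i|q_i\rangle \tag{4.118}$$

We have

$$\langle q_k|Q|p\rangle^* = \langle p|Q|q_k\rangle \tag{4.119}$$

$$\langle q_k|Q\sum_{i=1}^{N} a_i|q_i\rangle^* = \sum_{i=1}^{N} a_i^*\langle q_i|Q|q_k\rangle \tag{4.120}$$

$$\sum_{i=1}^{N} a_i^*\langle q_k|Q|q_i\rangle^* = \sum_{i=1}^{N} a_i^*\langle q_i|Q|q_k\rangle \tag{4.121}$$

$$\sum_{i=1}^{N} a_i^*\left[\langle q_k|Q|q_i\rangle^* - \langle q_i|Q|q_k\rangle\right] = 0 \tag{4.122}$$

where we have used the antilinear property.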


This says that the matrix elements of a Hermitian operator satisfy

$$Q^*_{ki} = Q_{ik} \quad \text{or} \quad Q^\dagger = Q \tag{4.123}$$


as we saw above. 

If $H$, $K$ are Hermitian, then so are $H + K$, $i[H, K]$, and $\alpha H$ for all real $\alpha$, and
$A^\dagger HA$ for any operator $A$, but $HK$ is Hermitian if and only if the commutator
$[H, K] = 0$.

The Hermitian operator $H$ is positive if $(f, Hf) = \langle f|H|f\rangle \geq 0$ for all $|f\rangle$; note
that if the equality holds for all $|f\rangle$, then $H = 0$. If $H$ is positive, then it obeys
the inequality

$$|(f, Hg)|^2 \leq (f, Hf)(g, Hg) \tag{4.124}$$

called the Schwarz inequality for positive operators.

If $K^\dagger = -K$, then the operator $K$ is called antihermitian; clearly, $iK$ is Hermitian.
An arbitrary operator $A$ can always be written as the sum $A = H_A + K_A$
of its Hermitian part

$$H_A = \frac{1}{2}(A + A^\dagger) \tag{4.125}$$

and its antihermitian part

$$K_A = \frac{1}{2}(A - A^\dagger) \tag{4.126}$$

Hermitian operators will be very important in our discussions because they will 
represent measurable quantities or observables. 
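
As a quick illustration of the decomposition into Hermitian and antihermitian parts, the following sketch splits an arbitrary $2 \times 2$ complex matrix (the matrix `A` and the helper names are illustrative assumptions).

```python
# Sketch only: the matrix A is an arbitrary illustrative choice.

def dagger(A):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def half_combination(A, B, sign):
    """Return (A + sign * B) / 2 elementwise."""
    return [[(a + sign * b) / 2 for a, b in zip(ra, rb)]
            for ra, rb in zip(A, B)]

A = [[1 + 2j, 3], [4j, 5 - 1j]]

H_A = half_combination(A, dagger(A), +1)   # Hermitian part (A + A^dagger)/2
K_A = half_combination(A, dagger(A), -1)   # antihermitian part (A - A^dagger)/2

# Recombine the two parts.
recombined = [[h + k for h, k in zip(rh, rk)] for rh, rk in zip(H_A, K_A)]
```

Here `dagger(H_A)` equals `H_A`, `dagger(K_A)` equals the negative of `K_A`, and `recombined` reproduces `A`.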


4.12. Eigenvalues and Eigenvectors 

Suppose that an operator $Q$ acting on a particular vector $|\beta\rangle$ returns a scalar
multiple of that vector

$$Q|\beta\rangle = b|\beta\rangle \tag{4.127}$$

The vector $|\beta\rangle$ is then called an eigenvector and the scalar $b$ an eigenvalue
of the operator $Q$. Using the definition of the adjoint operator and the
antilinear correspondence between bras and kets, we then also must have (using
any arbitrary vector $|\alpha\rangle$)

$$\langle\alpha|Q|\beta\rangle^* = b^*\langle\alpha|\beta\rangle^* \tag{4.128}$$

$$\langle\beta|Q^\dagger|\alpha\rangle = b^*\langle\beta|\alpha\rangle \tag{4.129}$$

$$\langle\beta|Q^\dagger = b^*\langle\beta| \tag{4.130}$$

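Now we assume that $Q$ is a Hermitian operator and that $Q|\beta\rangle = b|\beta\rangle$ again.
The Hermitian property says that

$$\langle\beta|Q^\dagger|\beta\rangle = \langle\beta|Q|\beta\rangle = \langle\beta|Q|\beta\rangle^* \tag{4.131}$$

$$\langle\beta|b|\beta\rangle = \langle\beta|b|\beta\rangle^* \tag{4.132}$$

$$(b - b^*)\langle\beta|\beta\rangle = 0 \tag{4.133}$$

$$b = b^* \tag{4.134}$$

where we have assumed that $\langle\beta|\beta\rangle \neq 0$. This means that all of the eigenvalues
of a Hermitian operator are real. Following this up, if $Q$ is a Hermitian operator
which satisfies

$$Q|\beta\rangle = b|\beta\rangle \quad \text{with} \quad Q = Q^\dagger \tag{4.135}$$

then

$$\langle\beta|Q = b\langle\beta| \tag{4.136}$$

or the ket vector and its corresponding bra vector are eigenvectors with the
same eigenvalue.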

Suppose that we have two eigenvectors $|\alpha\rangle$ and $|\beta\rangle$ of a Hermitian operator $Q$
with eigenvalues $a$ and $b$, respectively. We then have

$$Q|\alpha\rangle = a|\alpha\rangle \quad \text{and} \quad Q|\beta\rangle = b|\beta\rangle \tag{4.137}$$

$$0 = \langle\alpha|Q|\beta\rangle - \langle\beta|Q^\dagger|\alpha\rangle^* = \langle\alpha|Q|\beta\rangle - \langle\beta|Q|\alpha\rangle^* \tag{4.138}$$

$$0 = b\langle\alpha|\beta\rangle - a^*\langle\beta|\alpha\rangle^* = (b - a)\langle\alpha|\beta\rangle \tag{4.139}$$

This implies that eigenvectors corresponding to distinct (different) eigenvalues
($a \neq b$) are orthogonal, i.e., $\langle\alpha|\beta\rangle = 0$.
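
Both results (real eigenvalues, orthogonal eigenvectors for distinct eigenvalues) can be seen explicitly for a $2 \times 2$ Hermitian matrix, where the eigenvalue problem has a closed-form solution. The sketch below assumes a matrix of the form $\begin{pmatrix} a & c \\ c^* & d \end{pmatrix}$ with real $a$, $d$ and $c \neq 0$ (the function name and the numerical values are illustrative).

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def eig_hermitian_2x2(a, c, d):
    """Eigenpairs of [[a, c], [conj(c), d]] for real a, d and c != 0.
    The eigenvalues are the roots of the characteristic polynomial;
    (c, lam - a) is an (unnormalized) eigenvector for eigenvalue lam."""
    mean, half = (a + d) / 2, (a - d) / 2
    root = math.sqrt(half ** 2 + abs(c) ** 2)   # always real
    return [(lam, [c, lam - a]) for lam in (mean - root, mean + root)]

a, c, d = 2.0, 1 - 1j, 3.0
Q = [[a, c], [c.conjugate(), d]]

(lam1, v1), (lam2, v2) = eig_hermitian_2x2(a, c, d)

# Residuals Q|v> - lam|v>, which should vanish for true eigenpairs.
res1 = [sum(Q[i][j] * v1[j] for j in range(2)) - lam1 * v1[i] for i in range(2)]
res2 = [sum(Q[i][j] * v2[j] for j in range(2)) - lam2 * v2[i] for i in range(2)]

overlap = inner(v1, v2)   # <v1|v2>
```

For these values the two eigenvalues are real and distinct, and `overlap` vanishes up to rounding error.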

If $a = b$, that is, the two eigenvectors correspond to the same eigenvalue, then
they are called degenerate eigenvectors. In this case, we have that

$$Q|\alpha\rangle = a|\alpha\rangle \quad \text{and} \quad Q|\beta\rangle = a|\beta\rangle \tag{4.140}$$

Now, any linear combination of the degenerate eigenvectors is also an eigenvector
with the same eigenvalue as can be seen below

$$Q(c_1|\alpha\rangle + c_2|\beta\rangle) = c_1Q|\alpha\rangle + c_2Q|\beta\rangle = c_1a|\alpha\rangle + c_2a|\beta\rangle = a(c_1|\alpha\rangle + c_2|\beta\rangle) \tag{4.141}$$

It is, therefore, always possible to replace a nonorthogonal but linearly independent
set of degenerate eigenvectors by linear combinations of themselves that
are orthogonal (using the Gram-Schmidt process). For the case of two states
above (taken to be normalized, with a real overlap $\langle\alpha|\beta\rangle$), the orthogonal set
is easy to find, namely

$$|1\rangle = |\alpha\rangle + |\beta\rangle \qquad |2\rangle = |\alpha\rangle - |\beta\rangle \tag{4.142}$$

The number of linearly independent eigenvectors corresponding to a given eigenvalue
is called the multiplicity of that eigenvalue. Non-degenerate eigenvalues are called simple.

We will always assume that we have already carried out this orthogonalization 
process and that the set of eigenvectors (for both non-degenerate and degenerate 
eigenvalues) of any Hermitian operator is an orthogonal set. 

If the norms are all finite, then we can always construct an orthonormal set
where

$$\langle\alpha_i|\alpha_j\rangle = \delta_{ij} \tag{4.143}$$

The set of eigenvalues of an operator A is called the spectrum of A. 

In an $m$-dimensional space $\mathcal{H}$ choose a matrix representation in which $A = (A_{ij})$,
$g = (\gamma_i)$. Written in this representation, the eigenvalue equation $Ag = \lambda g$
becomes a system of $m$ homogeneous linear equations in the $\gamma_i$. The eigenvalues
$\lambda$ are the roots of the algebraic equation of degree $m$

$$\det(A_{ij} - \lambda\delta_{ij}) = 0 \tag{4.144}$$

known as the secular equation, which expresses the condition for the homogeneous
system of equations to have non-trivial solutions. In an $m$-dimensional
space every operator has at least one and at most $m$ distinct eigenvalues. If
all the eigenvalues of an arbitrary operator $A$ are simple, then there are $m$ linearly
independent eigenvectors of $A$ but there may be fewer if $A$ has multiple
eigenvalues.
4.13. Completeness

A set of orthonormal vectors $\{|\alpha_k\rangle, k = 1, 2, 3, \ldots, N\}$ is complete if we can
expand an arbitrary vector $|\eta\rangle$ in terms of that set, i.e.,

$$|\eta\rangle = \sum_j a_j|\alpha_j\rangle \tag{4.145}$$

The orthonormality condition then allows us to calculate the expansion coefficients
$a_j$ as

$$\langle\alpha_k|\eta\rangle = \sum_j a_j\langle\alpha_k|\alpha_j\rangle = \sum_j a_j\delta_{kj} = a_k \tag{4.146}$$

This implies that (remember the 2-dimensional example earlier)

$$|\eta\rangle = \sum_j|\alpha_j\rangle\langle\alpha_j|\eta\rangle = \Big(\sum_j|\alpha_j\rangle\langle\alpha_j|\Big)|\eta\rangle \tag{4.147}$$

or

$$\sum_j|\alpha_j\rangle\langle\alpha_j| = I = \text{identity operator} \tag{4.148}$$

This result is quite general.

For any complete orthonormal set of vectors $\{|q_k\rangle, k = 1, 2, 3, \ldots, N\}$ the sum over all of the
projection operators $|q_k\rangle\langle q_k|$ is the identity operator. This is one of the most
important results we will derive. It will enable us to perform very clever tricks
during algebraic manipulations. It is fundamentally linked to the probability
interpretation of quantum mechanics as we shall see later.

If a set of vectors $\{|q_k\rangle, k = 1, 2, 3, \ldots, N\}$ are eigenvectors of a Hermitian operator
$Q$, then we will always assume that they are a complete set. Thus, if

$$Q|q_k\rangle = q_k|q_k\rangle \tag{4.149}$$

then we can always write

$$Q = QI = Q\sum_j|q_j\rangle\langle q_j| = \sum_j q_j|q_j\rangle\langle q_j| \tag{4.150}$$

which completely expresses an operator in terms of its eigenvectors and eigenvalues.

Since we can use the same argument to show that

$$Q^n = Q^nI = Q^n\sum_j|q_j\rangle\langle q_j| = \sum_j q_j^n|q_j\rangle\langle q_j| \tag{4.151}$$

we then have for any function $f(x)$ that has a power series expansion of the
form

$$f(x) = \sum_k c_kx^k \tag{4.152}$$

that

$$f(Q) = f(Q)I = f(Q)\sum_j|q_j\rangle\langle q_j| = \sum_j\sum_k c_kQ^k|q_j\rangle\langle q_j| = \sum_j\sum_k c_kq_j^k|q_j\rangle\langle q_j| = \sum_j f(q_j)|q_j\rangle\langle q_j| \tag{4.153}$$

For a finite dimensional space, it can be shown that the eigenvectors of a Hermitian
operator always form a complete set and all of the above relations hold.
For infinite dimensional spaces the proof is not easily carried out in general (it
is usually true however).
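
The spectral expressions for $Q$ and $f(Q)$ can be tested on a small example. The sketch below assumes the $2 \times 2$ Hermitian matrix with zeros on the diagonal and ones off the diagonal, whose eigenvalues are $\pm 1$ with eigenvectors $(1, \pm 1)/\sqrt{2}$; it builds $f(Q) = \sum_j f(q_j)|q_j\rangle\langle q_j|$ for $f = \exp$ and compares with a truncated power series (all function names are illustrative).

```python
import math

def outer(u, v):
    """Matrix of the outer product |u><v|."""
    return [[x * complex(y).conjugate() for y in v] for x in u]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spectral_function(f, eigenpairs):
    """Sum of f(q_j)|q_j><q_j| over (eigenvalue, normalized eigenvector) pairs."""
    n = len(eigenpairs[0][1])
    F = [[0j] * n for _ in range(n)]
    for q, ket in eigenpairs:
        P = outer(ket, ket)
        for i in range(n):
            for j in range(n):
                F[i][j] += f(q) * P[i][j]
    return F

s = 1 / math.sqrt(2)
eigenpairs = [(1.0, [s, s]), (-1.0, [s, -s])]

Q = spectral_function(lambda x: x, eigenpairs)    # recovers Q itself
exp_Q = spectral_function(math.exp, eigenpairs)   # f(Q) with f = exp

# Independent check: truncated power series sum_k Q^k / k!
series = [[1, 0], [0, 1]]
term = [[1, 0], [0, 1]]
for k in range(1, 25):
    term = [[x / k for x in row] for row in matmul(term, Q)]
    series = [[x + y for x, y in zip(rs, rt)] for rs, rt in zip(series, term)]
```

For this matrix the diagonal elements of `exp_Q` are $\cosh 1$ and the off-diagonal elements are $\sinh 1$, and `series` agrees with `exp_Q` to rounding error.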

Before proceeding to a general discussion of the eigenvalue spectrum, the spectral
theorem and the problem of continuous eigenvalues, let us expand our knowledge
of some of the objects we have already defined and add a few definitions
that will be useful.


4.14. Expand Our Knowledge - Selected Topics 

4.14.1. Hilbert Space 

In an infinite dimensional space, we must determine whether the sums involved
in many of our definitions converge. If they do not converge in some expressions,
then the corresponding definitions are not valid. In addition, we must clearly
state what is meant by a linear combination of an infinite number of vectors.

We assume that an infinite linear combination

$$|\alpha\rangle = \sum_{k=1}^{\infty} a_k|q_k\rangle \tag{4.154}$$

is defined if the sequence of partial sums

$$|\alpha_n\rangle = \sum_{k=1}^{n} a_k|q_k\rangle \tag{4.155}$$

converges as $n \to \infty$ or, equivalently, that $|\alpha_n\rangle \to |\alpha\rangle$ as $n \to \infty$, where this
convergence is defined by the norm relation

$$\|\,|\alpha\rangle - |\alpha_n\rangle\,\| \to 0 \quad \text{as} \quad n \to \infty \tag{4.156}$$

The vector $|\alpha\rangle$ is called the limit vector of the sequence. A sequence of vectors
$|\alpha_n\rangle$ is called a Cauchy sequence if

$$\|\,|\alpha_m\rangle - |\alpha_n\rangle\,\| \to 0 \quad \text{as} \quad m, n \to \infty \tag{4.157}$$

A space is complete if every Cauchy sequence of vectors converges to a limit
vector in the space.
If a linear space with an inner product defined is complete, then it is called a
Hilbert space.

A Hilbert space is separable if it has an orthonormal basis consisting of a countable
(finite or infinite) number of vectors. Note that every finite-dimensional
space is complete (as we assumed earlier).

If a set of vectors $\{|q_k\rangle\}$ is such that every sequence of vectors in the set

$$|\alpha_n\rangle = \sum_{k=1}^{n} a_k|q_k\rangle \tag{4.158}$$

has a limit vector also in the set, then the set is closed.

Some examples are: 

Space $\ell^2$: this is the space of infinite sequences

$$(x_1, x_2, x_3, \ldots)$$

such that

$$\sum_{k=1}^{\infty}|x_k|^2 \text{ is finite}$$

This space is a separable Hilbert space. It has an orthonormal basis consisting
of

$$|q_1\rangle = (1, 0, 0, 0, \ldots) \qquad |q_2\rangle = (0, 1, 0, 0, \ldots) \qquad |q_3\rangle = (0, 0, 1, 0, \ldots) \qquad \ldots$$

since

$$(x_1, x_2, x_3, \ldots) = \sum_{k=1}^{\infty} x_k|q_k\rangle$$

Space $L^2(a, b)$: this is the space of square integrable functions on the interval
$(a, b)$. It is a separable Hilbert space. If we choose the interval $(0, 1)$, then
we have the space $L^2(0, 1)$ of square integrable functions $f(x)$ on the interval
$0 < x < 1$.

Another example comes from the theory of Fourier series, which says that the
set of orthonormal vectors (or functions in this case)

$$1, \quad \sqrt{2}\cos 2\pi kx \quad \text{and} \quad \sqrt{2}\sin 2\pi kx \quad \text{for } k = 1, 2, 3, \ldots$$

form an orthonormal basis of $L^2(0, 1)$.
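
The orthonormality of the Fourier functions on $(0, 1)$ can be checked numerically. The sketch below approximates the $L^2(0,1)$ inner product with a midpoint rule and builds the matrix of inner products for the first few basis functions (the function names, the cutoff `kmax = 3` and the number of grid points are illustrative assumptions).

```python
import math

def basis_functions(kmax):
    """The functions 1, sqrt(2) cos(2 pi k x), sqrt(2) sin(2 pi k x), k = 1..kmax."""
    funcs = [lambda x: 1.0]
    for k in range(1, kmax + 1):
        funcs.append(lambda x, k=k: math.sqrt(2) * math.cos(2 * math.pi * k * x))
        funcs.append(lambda x, k=k: math.sqrt(2) * math.sin(2 * math.pi * k * x))
    return funcs

def inner_L2(f, g, n=2000):
    """Midpoint-rule approximation of the integral of f(x) g(x) over (0, 1),
    valid here because the functions are real."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

funcs = basis_functions(3)

# Matrix of inner products; orthonormality means this is the identity.
gram = [[inner_L2(f, g) for g in funcs] for f in funcs]
```

The matrix `gram` should be the $7 \times 7$ identity matrix to within numerical error. This checks orthonormality only; completeness is a separate statement that a finite computation cannot establish.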

If a set of vectors within a larger set is closed and the larger set and the
smaller set both form linear vector spaces, the smaller set is called a subspace
of the larger set.

A subspace of a separable Hilbert space is also a separable Hilbert space.

If $\mathcal{R}$ is a subspace of a separable Hilbert space, then the set of all vectors which
are orthogonal to every vector in $\mathcal{R}$ is called the orthogonal complement $\mathcal{R}^\perp$ of
$\mathcal{R}$. $\mathcal{R}^\perp$ is also a subspace.

A Hilbert space preserves the one-to-one correspondence between the kets in 
the space and the bras or linear functionals in its dual space. 

4.14.2. Bounded Operators 

A linear operator $Q$ is continuous if

$$Q|\alpha_n\rangle \to Q|\alpha\rangle \tag{4.159}$$

for any sequence of vectors $|\alpha_n\rangle$ which converge to the limit vector $|\alpha\rangle$.

A linear operator is bounded if there exists a positive number $a$ such that

$$\|Q|\alpha\rangle\| \leq a\,\|\,|\alpha\rangle\,\| \tag{4.160}$$

for every vector $|\alpha\rangle$ in the space. The smallest $a$ is called the norm $\|Q\|$ of $Q$.

A linear operator is continuous if and only if it is bounded. Every operator 
on a finite-dimensional space is bounded. A bounded linear operator on an 
infinite-dimensional space can be represented by an infinite matrix. 

4.14.3. Inverses 

A linear operator $Q$ has an inverse if there exists a linear operator $M$ such that

$$MQ = I = QM \tag{4.161}$$

We denote the inverse of $Q$ by $M = Q^{-1}$.

In an $n$-dimensional vector space with a basis set $\{|q_k\rangle, k = 1, 2, 3, \ldots, n\}$, a
necessary and sufficient condition for a linear operator $Q$ to have an inverse is
any one of the following (all are equivalent):

1. there is no vector $|x\rangle$ (except the null vector) such that $Q|x\rangle = 0$.

2. the set of vectors $\{Q|q_k\rangle, k = 1, 2, 3, \ldots, n\}$ is linearly independent.

3. there exists a linear operator $M$ such that $MQ = I = QM$.

4. the matrix corresponding to $Q$ has a nonzero determinant.
We defined the matrix elements with respect to a basis set by $Q_{ij} = \langle q_i|Q|q_j\rangle$.
The determinant of a matrix is defined by

$$\det(Q) = \sum_{i_1 i_2 \ldots i_n}\epsilon_{i_1 i_2 \ldots i_n}Q_{i_1 1}Q_{i_2 2}\cdots Q_{i_n n} \tag{4.162}$$

where $\epsilon_{i_1 i_2 \ldots i_n}$ is the permutation symbol of order $n$ ($n$ indices) which is defined
by

$$\epsilon_{i_1 i_2 \ldots i_n} = \begin{cases} +1 & \text{if } i_1 i_2 \ldots i_n \text{ is an even permutation of } 123\ldots n \\ -1 & \text{if } i_1 i_2 \ldots i_n \text{ is an odd permutation of } 123\ldots n \\ 0 & \text{if any index is repeated} \end{cases} \tag{4.163}$$

Example: in the $3 \times 3$ case:

$$\det\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} = \sum_{i,j,k=1}^{3}\epsilon_{ijk}A_{i1}A_{j2}A_{k3}$$

$$= \epsilon_{123}A_{11}A_{22}A_{33} + \epsilon_{132}A_{11}A_{32}A_{23} + \epsilon_{213}A_{21}A_{12}A_{33} + \epsilon_{231}A_{21}A_{32}A_{13} + \epsilon_{312}A_{31}A_{12}A_{23} + \epsilon_{321}A_{31}A_{22}A_{13}$$

$$= A_{11}A_{22}A_{33} - A_{11}A_{32}A_{23} - A_{21}A_{12}A_{33} + A_{21}A_{32}A_{13} + A_{31}A_{12}A_{23} - A_{31}A_{22}A_{13}$$

where we have only included the nonzero terms (no repeated indices).
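
The permutation-symbol definition of the determinant can be implemented directly. The following sketch sums over all permutations of the row indices, which are the only index choices with a nonzero $\epsilon$ (the function names and the sample matrix are illustrative; this approach costs $n!$ terms and is meant only to mirror the definition, not to be efficient).

```python
from itertools import permutations

def permutation_sign(perm):
    """+1 for an even permutation of (0, 1, ..., n-1), -1 for an odd one,
    found by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def det(A):
    """Determinant as the sum over permutations of
    epsilon_{i1...in} A_{i1,1} A_{i2,2} ... A_{in,n}."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]   # factor with row index i_col, column col
        total += term
    return total

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
det_A = det(A)
```

Expanding along the first row gives $2(12 - 2) - 0 + 1(1 - 3) = 18$ for this matrix, which is the value `det_A` takes.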


We note that Rules (1), (2), (3) are not sufficient conditions in the
infinite-dimensional case.

Finally, if two linear operators have inverses, then the inverse of their product
is

$$(MN)^{-1} = N^{-1}M^{-1} \tag{4.164}$$


4.14.4. Unitary Operators 

A linear operator $G$ is unitary (orthogonal if we have a real linear space) if it
has an inverse and if $\|G|\alpha\rangle\| = \|\,|\alpha\rangle\,\|$ for every vector $|\alpha\rangle$, i.e., unitary operators
preserve norms or lengths of vectors.

If $G$ is a unitary operator, then a more general result is

$$|\alpha_1\rangle = G|\beta_1\rangle \ \text{and}\ |\alpha_2\rangle = G|\beta_2\rangle \quad \text{implies} \quad \langle\alpha_1|\alpha_2\rangle = \langle\beta_1|\beta_2\rangle \tag{4.165}$$

or unitary operators preserve inner products and not just norms. Using the fact
that $\langle\alpha_1| = \langle\beta_1|G^\dagger$, we then have, for a unitary operator $G$

$$\langle\alpha_1|\alpha_2\rangle = \langle\beta_1|G^\dagger G|\beta_2\rangle \quad \text{or} \quad G^\dagger G = I \quad \text{or} \quad G^\dagger = G^{-1} \tag{4.166}$$

i.e., the inverse is the Hermitian conjugate or adjoint.

In addition, the action of a unitary operator on a basis set preserves the fact 
that it is a basis set. 

The evolution of quantum systems in time will be given by a unitary operator. 
The inner product preserving property will be connected to the probability 
interpretation of quantum mechanics. 
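
The inner-product-preserving property is easy to verify for a specific unitary matrix. The sketch below assumes a $2 \times 2$ unitary built from two angles (the parametrization, the angle values and the test vectors are illustrative choices, not taken from the text).

```python
import cmath
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def apply_op(G, v):
    """Components of G|v>."""
    return [sum(g * x for g, x in zip(row, v)) for row in G]

theta, phi = 0.7, 1.3   # arbitrary illustrative parameters
G = [[math.cos(theta), -cmath.exp(1j * phi) * math.sin(theta)],
     [cmath.exp(-1j * phi) * math.sin(theta), math.cos(theta)]]

beta1, beta2 = [1 + 2j, -1j], [0.5, 3 - 1j]
alpha1, alpha2 = apply_op(G, beta1), apply_op(G, beta2)

before = inner(beta1, beta2)     # <beta_1|beta_2>
after = inner(alpha1, alpha2)    # <alpha_1|alpha_2>

# Matrix elements of G^dagger G are the inner products of the columns of G.
cols = [[G[0][j], G[1][j]] for j in range(2)]
GdG = [[inner(cols[i], cols[j]) for j in range(2)] for i in range(2)]
```

The values `before` and `after` agree to rounding error, and `GdG` is the identity matrix.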


4.14.5. More on Matrix Representations 

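Let $\{|b_i\rangle\}$ be an orthonormal basis on an $m$-dimensional space. An arbitrary
vector $|g\rangle$ can be written in terms of its components $\gamma_i = \langle b_i|g\rangle$, $i = 1, 2, \ldots, m$
as

$$|g\rangle = \sum_i\gamma_i|b_i\rangle \tag{4.167}$$

Given an operator $A$, the vector $|h\rangle = A|g\rangle$ has the components

$$\eta_i = \langle b_i|(A|g\rangle) = \langle b_i|A|g\rangle = \sum_j\gamma_j\langle b_i|A|b_j\rangle \tag{4.168}$$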


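One can identify the vectors $|g\rangle$ and $|h\rangle$ with column matrices formed by their
components

$$|g\rangle = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_m \end{pmatrix} \qquad |h\rangle = \begin{pmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_m \end{pmatrix} \tag{4.169}$$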


In fact, the vector space of column matrices with the usual definitions of addition
of matrices and multiplication by a number is isomorphic with $\mathcal{H}$. With
these definitions, the operator $A$ can be identified with the square matrix whose
elements are the $m^2$ numbers $A_{ij} = \langle b_i|A|b_j\rangle$

$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1m} \\ A_{21} & A_{22} & \cdots & A_{2m} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mm} \end{pmatrix} \tag{4.170}$$


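With these identifications, the components $\eta_i$ can be rewritten as

$$\eta_i = \sum_j A_{ij}\gamma_j \tag{4.171}$$

This is an expression which is identical to the matrix equation

$$\begin{pmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_m \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1m} \\ A_{21} & A_{22} & \cdots & A_{2m} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mm} \end{pmatrix}\begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_m \end{pmatrix} \tag{4.172}$$

or more succinctly $(\eta_i) = (A_{ij})(\gamma_j)$.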

When the above identifications are made we speak of the matrix representation
of $\mathcal{H}$ with respect to the basis $\{b_i\}$, or simply the $\{b_i\}$-representation of $\mathcal{H}$.

If in a given representation $A \Rightarrow (A_{ij})$ and $A^\dagger \Rightarrow (A^\dagger_{ij})$, then $A^\dagger_{ij} = A^*_{ji}$; thus,
if $A$ is hermitian, $A_{ij} = A^*_{ji}$. The representation of the identity operator with
respect to any orthonormal basis is the identity matrix $(\delta_{ij})$. The inverse $A^{-1}$
exists if and only if the determinant $\det(A_{ij}) \neq 0$; then $A^{-1}_{ij} = \mathrm{cof}(A_{ji})/\det(A_{ij})$
(cofactors). The matrix elements of a unitary operator $U$ satisfy the relation
$\sum_k U_{ik}U^*_{jk} = \delta_{ij}$.

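Change of Representation - Let $\{|b_i\rangle\}$, $\{|\bar{b}_i\rangle\}$ be two bases on $\mathcal{H}$. What is
the relationship between the representations of a vector $|g\rangle$ and an operator $A$
with respect to the bases $\{|b_i\rangle\}$ and $\{|\bar{b}_i\rangle\}$?

Consider first the vector $|g\rangle$ and let $\gamma_i = \langle b_i|g\rangle$ and $\bar{\gamma}_i = \langle\bar{b}_i|g\rangle$ be its components
along $|b_i\rangle$ and $|\bar{b}_i\rangle$ respectively. Since

$$|\bar{b}_i\rangle = \sum_j\langle b_j|\bar{b}_i\rangle|b_j\rangle \tag{4.173}$$

we have

$$\bar{\gamma}_i = \langle\bar{b}_i|g\rangle = \sum_j\langle\bar{b}_i|b_j\rangle\langle b_j|g\rangle \tag{4.174}$$

Defining the matrix $S \Rightarrow (S_{ij}) = \langle\bar{b}_i|b_j\rangle$ we can write $\bar{\gamma}_i = \sum_j S_{ij}\gamma_j$.

The matrix $S$ is unitary in the sense that

$$\sum_k S_{ik}S^*_{jk} = \delta_{ij} \tag{4.175}$$

Instead of thinking of $S$ as a matrix performing a change of bases we can think
of it as a unitary operator that generates a unitary transformation of $\mathcal{H}$
onto itself given by the correspondence

$$|\bar{f}\rangle = S|f\rangle \qquad \bar{A} = SAS^\dagger \tag{4.176}$$

For any vectors $|f\rangle$, $|g\rangle$ and any operator $A$ we then have

$$\langle\bar{f}|\bar{g}\rangle = \langle f|g\rangle \qquad \langle\bar{f}|\bar{A}|\bar{g}\rangle = \langle f|A|g\rangle \tag{4.177}$$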

4.14.6. Projection Operators 

Suppose we have a vector $|\alpha\rangle$ in a separable Hilbert space. The associated
projection operator $P_\alpha$ is a linear operator in Hilbert space whose action on
any other arbitrary vector in the space is to project out the component of that
arbitrary vector along the vector $|\alpha\rangle$, i.e.,

$$P_\alpha|\beta\rangle = a|\alpha\rangle \tag{4.178}$$

where $a$ is a scalar and $|\beta\rangle$ is any arbitrary vector. This is a projection operator
of the entire Hilbert space onto a one-dimensional subspace, namely, the single
vector $|\alpha\rangle$.

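Since, by definition, $P_\alpha|\alpha\rangle = |\alpha\rangle$, all projection operators satisfy the following
property:

$$P_\alpha P_\alpha|\beta\rangle = aP_\alpha|\alpha\rangle = a|\alpha\rangle = P_\alpha|\beta\rangle \tag{4.179}$$

$$(P_\alpha^2 - P_\alpha)|\beta\rangle = 0 \quad \text{or} \quad P_\alpha^2 = P_\alpha \tag{4.180}$$

If $|r\rangle$ is an eigenvector of $P_\alpha$, such that $P_\alpha|r\rangle = r|r\rangle$, then we have

$$P_\alpha^2|r\rangle = rP_\alpha|r\rangle = r^2|r\rangle \tag{4.181}$$

$$(P_\alpha^2 - P_\alpha)|r\rangle = (r^2 - r)|r\rangle = 0 \tag{4.182}$$

$$r^2 - r = 0 \quad \Rightarrow \quad r = 0, 1 \tag{4.183}$$

The eigenvalues of any projection operator are 0, 1.

In general, any operator that satisfies the relation $A^2 = A$ is called idempotent
and has eigenvalues 0, 1.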

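Two projection operators $P_{\alpha_1}$ and $P_{\alpha_2}$ are orthogonal if, for any arbitrary vector
$|\beta\rangle$

$$|\eta\rangle = P_{\alpha_1}|\beta\rangle \ \text{and}\ |\sigma\rangle = P_{\alpha_2}|\beta\rangle \quad \text{implies that} \quad \langle\eta|\sigma\rangle = 0 \tag{4.184}$$

The scalar constant in the equation $P_\alpha|\beta\rangle = a|\alpha\rangle$, since it is the component of
the arbitrary vector along $|\alpha\rangle$, is related to the inner product of the arbitrary
vector with $|\alpha\rangle$ by $a = \langle\alpha|\beta\rangle$, which implies that

$$P_\alpha|\beta\rangle = \langle\alpha|\beta\rangle|\alpha\rangle = (|\alpha\rangle\langle\alpha|)|\beta\rangle \tag{4.185}$$

or

$$P_\alpha = |\alpha\rangle\langle\alpha| \tag{4.186}$$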

For an orthonormal basis set $\{|q_i\rangle\}$, where $\langle q_i|q_j\rangle = \delta_{ij}$, we can define a set of
projection operators $\{P_i\}$, where each projection operator is given by

$$P_k = |q_k\rangle\langle q_k| \tag{4.187}$$

We then have

$$P_iP_j = \delta_{ij}P_j \quad \text{or} \quad |q_i\rangle\langle q_i|q_j\rangle\langle q_j| = \delta_{ij}|q_j\rangle\langle q_j| \tag{4.188}$$

so the projection operators are mutually orthogonal in this case.


As we stated earlier, the set of projection operators satisfies a completeness
property, i.e., for any vector we can write

$$|\psi\rangle = \sum_k\langle q_k|\psi\rangle|q_k\rangle = \sum_k|q_k\rangle\langle q_k|\psi\rangle = \sum_k P_k|\psi\rangle = \Big(\sum_k P_k\Big)|\psi\rangle \tag{4.189}$$

This implies

$$\sum_k P_k = I = \sum_k|q_k\rangle\langle q_k| \tag{4.190}$$

This relation is very powerful and allows us to easily do algebra using the Dirac
language by judicious insertion of $I$ operators.

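Some examples are:

1.

$$\langle\alpha|\beta\rangle = \langle\alpha|I|\beta\rangle = \sum_k\langle\alpha|(|q_k\rangle\langle q_k|)|\beta\rangle = \sum_k\langle\alpha|q_k\rangle\langle q_k|\beta\rangle$$

2.

$$\langle\alpha|Q|\beta\rangle = \langle\alpha|IQI|\beta\rangle = \langle\alpha|\Big(\sum_k|q_k\rangle\langle q_k|\Big)Q\Big(\sum_j|q_j\rangle\langle q_j|\Big)|\beta\rangle = \sum_k\sum_j\langle\alpha|q_k\rangle\langle q_k|Q|q_j\rangle\langle q_j|\beta\rangle$$

3.

$$Q|\beta\rangle = IQ|\beta\rangle = \Big(\sum_k|q_k\rangle\langle q_k|\Big)Q|\beta\rangle = \sum_k\langle q_k|Q|\beta\rangle|q_k\rangle$$

4.

$$\langle\alpha|QR|\beta\rangle = \langle\alpha|IQIRI|\beta\rangle = \langle\alpha|\Big(\sum_k|q_k\rangle\langle q_k|\Big)Q\Big(\sum_i|q_i\rangle\langle q_i|\Big)R\Big(\sum_j|q_j\rangle\langle q_j|\Big)|\beta\rangle = \sum_k\sum_j\sum_i\langle\alpha|q_k\rangle\langle q_k|Q|q_i\rangle\langle q_i|R|q_j\rangle\langle q_j|\beta\rangle$$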

4.14.7. Unbounded Operators 

Many of the operators we shall deal with in quantum mechanics are not bounded.
An example is easily constructed in the space $L^2(-\infty, \infty)$ of square-integrable
functions $f(x)$ for $-\infty < x < \infty$. Let $X$ be a linear operator defined by the
equation

$$Xf(x) = xf(x) \tag{4.191}$$

where $x$ is real. This is a Hermitian operator since

$$(g, Xf) = \int_{-\infty}^{\infty} g^*(x)Xf(x)\,dx = \int_{-\infty}^{\infty} g^*(x)xf(x)\,dx = \int_{-\infty}^{\infty} xg^*(x)f(x)\,dx = \int_{-\infty}^{\infty}[xg(x)]^*f(x)\,dx = \int_{-\infty}^{\infty}[Xg(x)]^*f(x)\,dx = (Xg, f) \tag{4.192}$$

provided that all the integrals converge. It is not bounded since

$$\|Xf\|^2 = \int_{-\infty}^{\infty}|xf(x)|^2\,dx \tag{4.193}$$

is not necessarily finite even if

$$\|f\|^2 = \int_{-\infty}^{\infty}|f(x)|^2\,dx < \infty \tag{4.194}$$
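
One way to see the unboundedness concretely is to look at a sequence of normalized functions that sit further and further from the origin. The sketch below assumes $f_n$ is the indicator function of the interval $[n, n+1]$, for which both norms can be computed in closed form (this choice of $f_n$ and the function names are illustrative assumptions, not from the text).

```python
import math

def norm_f(n):
    """Norm of f_n, the indicator function of [n, n+1]:
    the integral of 1 over an interval of length 1 is 1."""
    return 1.0

def norm_Xf(n):
    """Norm of X f_n: square root of the integral of x^2 over [n, n+1],
    which equals ((n+1)^3 - n^3) / 3."""
    return math.sqrt(((n + 1) ** 3 - n ** 3) / 3.0)

sample_n = (1, 10, 100, 1000)
ratios = [norm_Xf(n) / norm_f(n) for n in sample_n]
```

Each ratio $\|Xf_n\|/\|f_n\|$ is at least $n$, so no single constant $a$ can satisfy the boundedness condition (4.160) for every vector in the space.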


4.15. Eigenvalues and Eigenvectors of Unitary Operators

We need to cover a few more details in this area. If $U$ is a linear operator which
has an inverse $U^{-1}$, then the operators $UQU^{-1}$ and $Q$ have the same eigenvalues,
that is, if $Q|\alpha\rangle = a|\alpha\rangle$, then

$$UQU^{-1}(U|\alpha\rangle) = UQ|\alpha\rangle = aU|\alpha\rangle \tag{4.195}$$

which says that $UQU^{-1}$ has the same eigenvalues as $Q$ and its eigenvectors are
$U|\alpha\rangle$.

The eigenvectors and eigenvalues of unitary operators have these properties: 

1. the eigenvalues are complex numbers of absolute value one 

2. two eigenvectors are orthogonal if they correspond to different eigenvalues 

3. in a complex finite-dimensional space, the eigenvectors span the space 


4.16. Eigenvalue Decomposition 

Let us now look deeper into the representation of operators in terms of their 
eigenvalues and eigenvectors. 

As we stated earlier, for a finite-dimensional vector space, the eigenvectors of a
Hermitian operator always form an orthonormal basis set or they always span
the space.

Suppose we have a Hermitian operator $B$ with eigenvectors $\{|b_k\rangle, k = 1, 2, 3, \ldots, n\}$
and eigenvalues $b_k$ where

$$B|b_k\rangle = b_k|b_k\rangle \tag{4.196}$$

Labeling the states by the eigenvalues as above will become a standard practice 
as we get into quantum mechanics. 

We showed earlier that we could represent an operator by the expression

$$B = \sum_j b_j|b_j\rangle\langle b_j| \tag{4.197}$$

in terms of its eigenvalues and the projection operators constructed from the
basis set (its eigenvectors).


4.17. Extension to Infinite-Dimensional Spaces 

We now extend the properties we have been discussing to infinite-dimensional
spaces. First, we extend the properties of projection operators. The projection
operators we have been considering are a special case of a more general definition.
In particular, the projection operator we have been discussing $P_\alpha = |\alpha\rangle\langle\alpha|$
projects the vector space onto the 1-dimensional subspace spanned by the vector
$|\alpha\rangle$.

We extend the definition by defining a projection operator onto larger subspaces.

Let $E_M$ be a projection operator onto the subspace $M$ (not necessarily
1-dimensional). This means that for any vector $|\eta\rangle$ in the space, there are unique
vectors $|\eta\rangle_M$ in $M$ and $|\eta\rangle_{M^\perp}$ in $M^\perp$, which is called the orthogonal complement of $M$, such that
we can always write

$$E_M|\eta\rangle = |\eta\rangle_M \tag{4.198}$$

and

$$|\eta\rangle = |\eta\rangle_M + |\eta\rangle_{M^\perp} \tag{4.199}$$

for every $|\eta\rangle$ in the space. The operator $P_\alpha = |\alpha\rangle\langle\alpha|$ is clearly a special case
where the subspace $M$ is spanned by only one vector, namely, $|\alpha\rangle$.

The more general projection operators $E_M$ satisfy all the same properties listed
earlier for the single-state projection operators $P_\alpha$.

If $E_N$ is the projection on the $n$-dimensional subspace $N$, one can select an
orthonormal basis $\{|b_i\rangle\}$ on $\mathcal{H}$, $n$ of whose vectors $|b_1\rangle, \ldots, |b_n\rangle$ form a basis on
$N$. In the corresponding representation $E_N$ has the $n$ diagonal matrix elements
$(E_N)_{kk}$, $k = 1, \ldots, n$ equal to one and all the others equal to zero.
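Example: Given the 2-dimensional space $C$ spanned by the basis

$$|b_1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad |b_2\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

The projection operator onto the 1-dimensional space $A$ (spanned by $|b_1\rangle$) is

$$E_1 = |b_1\rangle\langle b_1| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

The projection operator onto the 1-dimensional space $B$ (spanned by $|b_2\rangle$) is

$$E_2 = |b_2\rangle\langle b_2| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

Note that $A \oplus B = C$, $B = A^\perp$, $A = B^\perp$.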

Before proceeding further let us look at the properties of projections and expand 
on our earlier discussions. 

Let $M \subset \mathcal{H}$ be a subspace and $M^\perp$ its orthogonal complement. Every vector
$|h\rangle$ can be written in a unique manner as $|h\rangle = |f\rangle + |g\rangle$, with $|f\rangle \in M$, $|g\rangle \in M^\perp$.
$|f\rangle$ is called the orthogonal projection of $|h\rangle$ on $M$. The linear operator $E_M$
defined by $|f\rangle = E_M|h\rangle$ is called the projection operator on $M$. Its domain is
the whole space and its range is the subspace $M$. We say that $E_M$ projects
on $M$. In general, an operator which projects on some subspace is called a
projection operator. An operator $E$ is a projection operator if and only if
$E^2 = E$ and $E^\dagger = E$. Let $E_M$ and $E_N$ be the projections on the subspaces $M$
and $N$ respectively. The product $E_ME_N$ is a projection if and only if both
operators commute, in which case $E_ME_N$ projects on the intersection of $M$ and
$N$. The sum $E_M + E_N$ is a projection if and only if $E_ME_N = 0$, which means
that $M$ and $N$ are orthogonal. In that case, $E_M + E_N$ projects on $M \oplus N$. The
difference $E_M - E_N$ is a projection operator if and only if $E_ME_N = E_N$, which
is equivalent to $N \subseteq M$, i.e., $N$ is a subset of $M$ or $N$ is contained in $M$. In this
case $\|E_N|f\rangle\| \leq \|E_M|f\rangle\|$ for all $|f\rangle \in \mathcal{H}$ and we write $E_N \leq E_M$.

The dimension of a projection operator Em is the dimension of the range M. 

Any two vectors $|f\rangle$, $|g\rangle$ determine the operator $|f\rangle\langle g|$ defined by $(|f\rangle\langle g|)\,|h\rangle = \langle g|h\rangle\,|f\rangle$. We have $(|f\rangle\langle g|)^\dagger = |g\rangle\langle f|$. In particular, $|f\rangle\langle f|$ is Hermitian and, if $|f\rangle$ is normalized, it is the one-dimensional projection whose range is spanned by $|f\rangle$.

A subspace $M$ is invariant under an operator $\hat A$ if for all vectors $|f\rangle \in M$ we have $\hat A|f\rangle \in M$. If, furthermore, $\hat A|g\rangle \in M^\perp$ for all $|g\rangle \in M^\perp$, that is, if both $M$ and $M^\perp$ are invariant under $\hat A$, then the subspace $M$ is said to reduce the operator $\hat A$. The statements "$M$ reduces $\hat A$" and "$\hat A$ commutes with $\hat E_M$" are equivalent.

Let $M$ be invariant under $\hat A$. With $|f\rangle \in M$ and $|g\rangle \in M^\perp$ given as above, we have

$$(\hat A^\dagger |g\rangle)^\dagger |f\rangle = (\langle g|\hat A)\,|f\rangle = \langle g|\,(\hat A|f\rangle) = \langle g|\hat A|f\rangle = 0 \tag{4.200}$$

Therefore, $\hat A^\dagger |g\rangle \in M^\perp$ or, equivalently, $M^\perp$ is invariant under $\hat A^\dagger$.

From this result, one sees immediately that if $M$ is invariant under the hermitian operator $\hat B$, then $M$ reduces $\hat B$. The same is true if $\hat U$ is unitary because then, from

$$(\hat U|f\rangle)^\dagger (\hat U|g\rangle) = \langle f|\hat U^\dagger \hat U|g\rangle = \langle f|g\rangle = 0 \tag{4.201}$$

we conclude that $\hat U|g\rangle$, being orthogonal to $\hat U|f\rangle$, must be in $M^\perp$. If no subspaces other than $\mathcal{H}$ (the entire space itself) and the null subspace $\{0\}$ reduce every member of a set of operators, then the set is called irreducible. It follows that a set of operators is irreducible if and only if the only operators that commute with every member of the set are multiples of the identity. If there is a subspace that reduces every operator of the set, then the set is said to be reducible.

Now let $\hat W$ be a hermitian operator with $r$ distinct eigenvalues $\lambda_i$, $i = 1, \ldots, r$. In addition, let $M_i$ and $\hat E_i$ be the eigensubspace and eigenprojection belonging to the eigenvalue $\lambda_i$. $M_i$ is invariant under $\hat W$ and reduces $\hat W$ or, equivalently, $\hat E_i \hat W = \hat W \hat E_i$. The subspace $M = \oplus_i M_i$ spanned by all the eigenvectors of $\hat W$ also reduces $\hat W$ and the corresponding projection operator

$$\hat E = \sum_i \hat E_i \tag{4.202}$$

commutes with $\hat W$.

This result has the following important consequence: the eigenvectors of a Hermitian operator span the whole space, that is, $\hat E = \hat I$ or $M$ = entire vector space, which generalizes the ideas we found earlier.

Regarded as an operator on $M_i$, the operator $\hat W$ multiplies every vector by the number $\lambda_i$. Therefore, it is equal to $\lambda_i \hat I$ and the multiplicity of $\lambda_i$ equals its degeneracy. One can write $\hat W$ as the direct sum $\oplus_i \lambda_i \hat I_i$ or equivalently, as the sum

$$\sum_i \lambda_i \hat E_i \tag{4.203}$$

in terms of the eigenprojections on the vector space.

Collecting together all these results, we have the simplest form of the spectral 
theorem in a unitary space. 

To every Hermitian operator $\hat W$ on an $m$-dimensional unitary space there corresponds a unique family of non-zero projection operators, the eigenprojections of the space, $\hat E_i$, $i = 1, \ldots, r$, $r \le m$, with the following properties:

1. The projections $\hat E_i$ are pairwise orthogonal

$$\hat E_i \hat E_j = \hat E_i \delta_{ij} \tag{4.204}$$

2. The family of projections is complete

$$\sum_{i=1}^{r} \hat E_i = \hat I \tag{4.205}$$

3. There exists a unique set of $r$ distinct real numbers $\lambda_i$, the eigenvalues of $\hat W$, such that

$$\hat W = \sum_{i=1}^{r} \lambda_i \hat E_i \tag{4.206}$$

This expression is the spectral resolution of $\hat W$.

The range of $\hat E_i$ is the eigensubspace $M_i$ belonging to $\lambda_i$. Its dimension is the degeneracy or multiplicity $s_i$ of $\lambda_i$. On it one can construct an orthonormal basis $\{|b_i^r\rangle\}$, $r = 1, \ldots, s_i$. It follows from the completeness of the family $\hat E_i$ that the union of all those bases $\{|b_i^r\rangle\}$, $r = 1, \ldots, s_i$, $i = 1, \ldots, m$ constitutes an orthonormal basis on the vector space. One often expresses this fact by saying that $\hat W$ possesses a complete set of eigenvectors.
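
The three properties of the spectral theorem can be checked by hand on a small case. The sketch below is an illustration only: it assumes the $2 \times 2$ Hermitian matrix $W$ with rows $(0, 1)$ and $(1, 0)$, whose eigenvalues are $+1$ and $-1$ and whose eigenprojections are $(I + W)/2$ and $(I - W)/2$. The helper names are hypothetical.

```python
# Sketch: spectral resolution W = sum_i lambda_i E_i for W = [[0, 1], [1, 0]].
# Eigenvalues +1 and -1; eigenprojections (I + W)/2 and (I - W)/2.
# Helper names (matmul, combine) are illustrative assumptions.

def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(c1, a, c2, b):
    """Return the matrix c1*a + c2*b."""
    n = len(a)
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(n)] for i in range(n)]

W = [[0.0, 1.0], [1.0, 0.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
E_plus = combine(0.5, I2, 0.5, W)     # eigenprojection for lambda = +1
E_minus = combine(0.5, I2, -0.5, W)   # eigenprojection for lambda = -1

print(matmul(E_plus, E_minus))              # pairwise orthogonal: zero matrix
print(combine(1.0, E_plus, 1.0, E_minus))   # complete: identity matrix
print(combine(1.0, E_plus, -1.0, E_minus))  # spectral resolution: W itself

# Probabilities <f|E_i|f> for the normalized vector f = (1, 0)
probs = [E_plus[0][0], E_minus[0][0]]
print(probs, sum(probs))
```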

It is an immediate consequence of (1) and (2) above that for every normalized vector $|f\rangle$ we have $\langle f|\hat E_i|f\rangle \ge 0$ and

$$\sum_i \langle f|\hat E_i|f\rangle = 1 \tag{4.207}$$

that is, for each normalized vector and each family of projections, the set of numbers $P_i = \langle f|\hat E_i|f\rangle$ constitutes a so-called discrete probability distribution. This will be of fundamental importance in quantum theory as we shall see later.

Two hermitian operators

$$\sum_i \alpha_i \hat E_i \quad \text{and} \quad \sum_i \beta_i \hat F_i \tag{4.208}$$

commute if and only if every pair of their eigenprojections commute.

From the orthogonality of the projections in the spectral resolution of $\hat W$ we have for every normalized vector $|f\rangle$

$$\|\hat W|f\rangle\|^2 = \langle f|\hat W^2|f\rangle = \sum_i \lambda_i^2 \langle f|\hat E_i|f\rangle \le \lambda_m^2 \tag{4.209}$$

where $\lambda_m$ is the eigenvalue of largest absolute value and the equality holds for $|f\rangle$ in the range of $\hat E_m$. It follows that the norm of $\hat W$ is $|\lambda_m|$.

Functions of a Hermitian Operator - Using the spectral resolution of a Hermitian operator

$$\hat Q = \sum_{i=1}^{r} \lambda_i \hat E_i \tag{4.210}$$

one can verify that for a non-negative integer $n$ one has

$$\hat Q^n = \sum_{i=1}^{r} \lambda_i^n \hat E_i \tag{4.211}$$

(also valid for negative $n$ if all $\lambda_i$ are non-zero). This property suggests the following definition of a function of a Hermitian operator.

If $F(x)$ is a complex-valued function of the real variable $x$, then the function $F(\hat Q)$ of the Hermitian operator $\hat Q$ is the operator

$$F(\hat Q) = \sum_i F(\lambda_i) \hat E_i \tag{4.212}$$
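
To see the definition of a function of an operator at work, here is a minimal sketch that evaluates $F(\hat Q) = \sum_i F(\lambda_i)\hat E_i$ for an assumed $2 \times 2$ example with eigenvalues $\pm 1$ (the matrix with rows $(0,1)$ and $(1,0)$). For $F(x) = e^{itx}$ the result should agree with $\cos t\, I + i \sin t\, Q$, since $Q^2 = I$ in this example. The function name `func_of_operator` is hypothetical.

```python
import cmath
import math

def func_of_operator(F, eigenvalues, projectors):
    """Return sum_i F(lambda_i) E_i as a nested list (square matrices assumed)."""
    n = len(projectors[0])
    return [[sum(F(lam) * E[i][j] for lam, E in zip(eigenvalues, projectors))
             for j in range(n)] for i in range(n)]

# Assumed example: Q = [[0, 1], [1, 0]] with eigenvalues +1, -1
eigenvalues = [1.0, -1.0]
projectors = [[[0.5, 0.5], [0.5, 0.5]],
              [[0.5, -0.5], [-0.5, 0.5]]]

t = 0.3
U = func_of_operator(lambda x: cmath.exp(1j * t * x), eigenvalues, projectors)
expected = [[math.cos(t), 1j * math.sin(t)],
            [1j * math.sin(t), math.cos(t)]]

# Q^2 from F(x) = x^2; equals the identity in this example
Q_squared = func_of_operator(lambda x: x * x, eigenvalues, projectors)

print(U)
print(Q_squared)
```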


Now, as we saw above, for every vector $|f\rangle$ the family of projections $\hat E_i$ belonging to a Hermitian operator determines a probability distribution on a discrete finite sample space with the probabilities given by $P_i = \langle f|\hat E_i|f\rangle$.

In probability theory, the passage to a discrete but infinite sample space offers 
no difficulty; the sums that yield averages are simply replaced by convergent 
series. The difficulties appear in the case where the sample space is continuous. 
One cannot then construct the probabilities for every possible set of outcomes from the probabilities of the points of the space (which are in general zero). Instead, one must consider the probabilities of appropriate elementary sets which, in the one-dimensional sample space in which we are interested here, can be taken to be intervals of the real line. To discuss such a case it is convenient to introduce the probability distribution function $D(\alpha)$ defined as the probability that the outcome of a trial does not exceed $\alpha$.

In the discrete case, let $P_i$ be the probability that the outcome of a trial yields the value $\xi_i$ and let us order the indices in such a way that $\xi_i < \xi_j$ for $i < j$. Written in terms of the $P_i$ the distribution function is

$$D(\alpha) = \sum_{i=-\infty}^{i_\alpha} P_i \tag{4.213}$$

where $i_\alpha$ is the largest index such that $\xi_i \le \alpha$ (we are assuming that the sample space is infinite, but the finite case is included if we put the appropriate $P_i = 0$). The function $D(\alpha)$ is clearly a non-decreasing ladder function with the properties $D(-\infty) = 0$, $D(\infty) = 1$. We have chosen the upper limit in the summation so as to satisfy the convention that $D(\alpha)$ is continuous on the right.
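
A short sketch of such a ladder function follows. The outcome values and probabilities are made-up numbers chosen only for illustration, and `distribution` is a hypothetical helper name.

```python
def distribution(values, probs):
    """Return D(alpha) = sum of P_i over all outcomes xi_i <= alpha.

    Using <= makes D continuous on the right, as in the convention above.
    """
    def D(alpha):
        return sum(p for xi, p in zip(values, probs) if xi <= alpha)
    return D

# Illustrative (assumed) outcomes and probabilities
D = distribution([1.0, 2.0, 4.0], [0.25, 0.5, 0.25])

for alpha in (0.5, 1.0, 3.0, 4.0, 10.0):
    print(alpha, D(alpha))
```

The printed values stay constant between outcomes and jump by $P_i$ exactly at each outcome value.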

One can imagine a pure continuous sample space as the limit of a discrete one when the differences $\xi_{i+1} - \xi_i$ tend to zero. The distribution function then becomes the limit of a ladder function with the properties just mentioned. It is, therefore, a continuous non-decreasing function $D(\alpha)$ such that $D(-\infty) = 0$, $D(\infty) = 1$. In the case where $D(\alpha)$ is everywhere differentiable, as is the case in most physical problems, the probability density $\pi(\alpha)$ is defined by $\pi(\alpha) = D'(\alpha) = dD(\alpha)/d\alpha$ and the average of the random variable $f(\alpha)$ is given by

$$\langle f(\alpha) \rangle = \int_{-\infty}^{\infty} f(\alpha)\pi(\alpha)\, d\alpha \tag{4.214}$$

In the general case, the sample space is continuous but includes non-zero probability $P_i$ concentrated at a countable number of points $\alpha_i$. In this case $D(\alpha)$ will have a countable number of discontinuities of size $P_i$. If $D(\alpha)$ is differentiable everywhere else and $\pi(\alpha)$ is the corresponding probability density, then we have

$$\langle f(\alpha) \rangle = \int_{-\infty}^{\infty} f(\alpha)\pi(\alpha)\, d\alpha + \sum_i f(\alpha_i) P_i \tag{4.215}$$

Alternatively, using the Dirac $\delta$-function we can write

$$\langle f(\alpha) \rangle = \int_{-\infty}^{\infty} f(\alpha)\pi_d(\alpha)\, d\alpha \tag{4.216}$$

where $\pi_d(\alpha)$ is the derivative of $D(\alpha)$, that is,

$$\pi_d(\alpha) = \pi(\alpha) + \sum_i P_i \delta(\alpha - \alpha_i) \tag{4.217}$$

The average can also be conveniently written by using the Stieltjes integral defined for a function $g(\alpha)$ with a countable number of discontinuities by

$$\int_a^b f(\alpha)\, dg(\alpha) = \lim_{n \to \infty} \sum_{i=1}^{n} f(\alpha_i)\,[g(\alpha_i) - g(\alpha_{i-1})] \tag{4.218}$$

where the $\alpha_i$ ($a = \alpha_0 < \alpha_1 < \alpha_2 < \cdots < \alpha_n = b$) form a subdivision of the interval $(a, b)$ and the limit is taken over a sequence of ever finer subdivisions. In terms of the Stieltjes integral we then have

$$\langle f(\alpha) \rangle = \int_{-\infty}^{\infty} f(\alpha)\, dD(\alpha) \tag{4.219}$$
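
The Stieltjes sum can be evaluated directly for a mixed distribution. The sketch below assumes a made-up $D(\alpha)$ consisting of a uniform density $1/4$ on $[0, 2]$ plus a probability $1/2$ concentrated at $\alpha = 1$, and takes $f(\alpha) = \alpha^2$. The sum of the continuous and discrete contributions is then $\tfrac{1}{4}\cdot\tfrac{8}{3} + \tfrac{1}{2} = \tfrac{7}{6}$. The function name `stieltjes` is hypothetical.

```python
def stieltjes(f, g, a, b, n):
    """Approximate the Stieltjes integral of f dg on (a, b) using n equal steps,
    evaluating f at the right end of each subinterval."""
    total = 0.0
    prev = g(a)
    for i in range(1, n + 1):
        alpha = a + (b - a) * i / n
        cur = g(alpha)
        total += f(alpha) * (cur - prev)
        prev = cur
    return total

def D(alpha):
    """Assumed mixed distribution: density 1/4 on [0, 2] plus weight 1/2 at alpha = 1."""
    continuous = 0.25 * min(max(alpha, 0.0), 2.0)
    jump = 0.5 if alpha >= 1.0 else 0.0   # right-continuous step
    return continuous + jump

mean_sq = stieltjes(lambda x: x * x, D, 0.0, 2.0, 20000)
norm = stieltjes(lambda x: 1.0, D, 0.0, 2.0, 20000)
print(mean_sq)   # close to 7/6
print(norm)      # close to 1
```

No derivative of $D$ is ever taken, so the jump at $\alpha = 1$ needs no special treatment.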

Now let us consider a Hermitian operator $\hat B$ in an infinite-dimensional space. We label its discrete eigenvalues in order of increasing eigenvalue where we assume

$$b_1 < b_2 < b_3 < \cdots < b_{m-1} < b_m \quad \text{and} \quad \hat B|b_j\rangle = b_j|b_j\rangle \tag{4.220}$$

For each real number $x$ we define the operator

$$\hat E_x = \sum_{b_j \le x} \hat P_j = \sum_{b_j \le x} |b_j\rangle\langle b_j| \tag{4.221}$$

With this definition, $\hat E_x$ is the projection operator onto the subspace spanned by all eigenvectors with eigenvalues $b_k \le x$. If $x < b_1$ (the smallest eigenvalue), then $\hat E_x$ is zero (no terms in the sum) and if $x \ge b_m$ (the largest eigenvalue), then $\hat E_x = \hat I$ because of the completeness property

$$\sum_{k=1}^{m} \hat P_k = \hat I \tag{4.222}$$

$\hat E_x$ increases from zero to the identity as $x$ increases through the spectrum of eigenvalues. In fact, $\hat E_x$ increases (jumps) by an amount $\hat P_k$ when $x$ reaches the eigenvalue $b_k$.

For each $x$ let us define $d\hat E_x = \hat E_x - \hat E_{x-\epsilon}$ where $\epsilon$ is positive and small enough so that there is no eigenvalue $b_j$ such that $(x - \epsilon) \le b_j < x$. This means that $d\hat E_x$ is not zero only when $x$ is an eigenvalue $b_k$ and $d\hat E_x = \hat P_k$ for $x = b_k$.
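
The behaviour of this step family is easy to tabulate in the eigenbasis, where every projector is diagonal. The sketch below is illustrative only: the three eigenvalues are made-up numbers, the projector is stored as the list of its diagonal entries, and the convention that an eigenvalue equal to $x$ is included in the sum is an assumption made here so that the jump at $x = b_k$ is picked up by the increment.

```python
# Sketch of the step family E_x in the eigenbasis of B (all projectors diagonal).
eigenvalues = [-1.0, 0.5, 2.0]   # assumed example values b_1 < b_2 < b_3

def E(x):
    """Diagonal of E_x: 1 for each eigenvalue b_k <= x, else 0."""
    return [1 if b <= x else 0 for b in eigenvalues]

def dE(x, eps=1e-6):
    """Increment E_x - E_{x - eps}; nonzero only if x is an eigenvalue."""
    return [p - q for p, q in zip(E(x), E(x - eps))]

print(E(-2.0))   # zero operator below the spectrum
print(E(0.5))    # projects onto the first two eigenvectors
print(E(3.0))    # identity above the spectrum
print(dE(0.5))   # jump equal to P_2
print(dE(1.0))   # no eigenvalue here, so no jump
```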

Let us say this very important stuff in still another way. 

In a unitary space, the family of projections $\hat E_i$ belonging to a hermitian operator

$$\hat A = \sum_i \xi_i \hat E_i \tag{4.223}$$

can be thought of as constituting a probability distribution on a finite sample space of operators. We shall take the eigenvalues to be ordered in ascending order. For every vector $|f\rangle$ the operator valued probability $\hat E_i$ generates the ordinary probability distribution $P_i = \langle f|\hat E_i|f\rangle$, which in quantum mechanics will give the probability of the outcome $\xi_i$ upon measurement of $\hat A$ on a system in the state $|f\rangle$ as we shall see.

In analogy with ordinary probability ideas one can construct a corresponding operator valued probability distribution function

$$\hat E(\alpha) = \sum_{i=-\infty}^{i_\alpha} \hat E_i \tag{4.224}$$

This is a formula that one would expect to also be applicable in the case of the infinite-dimensional Hilbert space if the operator $\hat A$ corresponds to an observable which yields upon measurement a countable set of possible values. From the properties of the family $\hat E_i$ it follows that $\hat E(\alpha)$ is a projection operator, that $\hat E(\alpha)$ and $\hat E(\alpha')$ commute, that we have $\hat E(\alpha') \ge \hat E(\alpha)$ for $\alpha' > \alpha$ and that $\hat E(-\infty) = 0$, $\hat E(\infty) = \hat I$. $\hat E(\alpha)$ is an operator valued ladder function whose jumps at the discontinuities are given by $\hat E(\xi_i) - \hat E(\xi_i -) = \hat E_i$.

Quantum mechanics will require that we consider operators on the Hilbert space that will correspond to observables that yield upon measurement a continuous range of possible values. Such operators are associated with operator valued probability distribution functions analogous to the continuous distribution functions of ordinary probability, that is, with a family of projections $\hat E(\alpha)$ depending continuously on the parameter $\alpha$ such that $\hat E(\alpha') \ge \hat E(\alpha)$ for $\alpha' > \alpha$ and that $\hat E(-\infty) = 0$, $\hat E(\infty) = \hat I$. In the most general case corresponding to the continuous sample space with points of concentrated probability we would expect the family $\hat E(\alpha)$ to have discontinuities at $\xi_i$ of the form $\hat E(\xi_i) - \hat E(\xi_i -) = \hat E_i$ where $\hat E_i$ is the operator valued probability concentrated at the point $\xi_i$.

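A spectral family is a one parameter family of commuting projection operators $\hat E(\alpha)$ depending on the real parameter $\alpha$ that satisfies the following properties:

1. $\hat E(\alpha)$ is increasing: $\hat E(\alpha') \ge \hat E(\alpha)$ for $\alpha' > \alpha$

2. $\hat E(\alpha)$ is continuous on the right: $\hat E(\alpha') \to \hat E(\alpha)$ for $\alpha' \to \alpha$ and $\alpha' > \alpha$

3. $\hat E(\alpha)$ is complete: $\hat E(-\infty) = 0$, $\hat E(\infty) = \hat I$

It follows that $\hat E(\alpha)\hat E(\alpha') = \hat E(\alpha')\hat E(\alpha) = \hat E(\alpha)$ for $\alpha' > \alpha$.

The tentative conclusions of this intuitive discussion can, in fact, be proven to be correct. We now formulate them precisely.

In place of

$$\sum_{k=1}^{m} |b_k\rangle\langle b_k| = \sum_{k=1}^{m} \hat P_k = \hat I \tag{4.225}$$

we can now formally write

$$\int_{-\infty}^{\infty} d\hat E_x = \hat I \tag{4.226}$$

and in place of

$$\hat B = \sum_{j=1}^{m} b_j |b_j\rangle\langle b_j| \tag{4.227}$$

we can now formally write

$$\hat B = \int_{-\infty}^{\infty} x\, d\hat E_x \tag{4.228}$$

Additionally, we have

$$\langle \alpha|\beta\rangle = \int_{-\infty}^{\infty} d\langle \alpha|\hat E_x|\beta\rangle \quad , \quad \langle \alpha|\hat B|\beta\rangle = \int_{-\infty}^{\infty} x\, d\langle \alpha|\hat E_x|\beta\rangle \tag{4.229}$$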


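We can easily show the validity of these expressions in the case of a finite-dimensional space with discrete eigenvalues. In this case, we can satisfy all of the properties of $\hat E_x$ by writing

$$\hat E_x = \sum_k \hat P_k\, \theta(x - b_k) \tag{4.230}$$

where

$$\theta(x - b_k) = \begin{cases} 1 & \text{if } x > b_k \\ 0 & \text{if } x < b_k \end{cases} \tag{4.231}$$

This is called the Heaviside step function.

Then

$$d\hat E_x = \sum_k \hat P_k\, d\theta(x - b_k) = \sum_k \hat P_k \frac{d}{dx}\theta(x - b_k)\, dx = \sum_k \hat P_k\, \delta(x - b_k)\, dx \tag{4.232}$$

where

$$\delta(x - c) = \text{Dirac } \delta\text{-function} = 0 \quad \text{if } x \ne c \tag{4.233}$$

and satisfies

$$\int_{-\infty}^{\infty} \delta(x - c)\, dx = 1 \tag{4.234}$$

In addition, we have used

$$\int_{-\infty}^{\infty} g(x)\delta(x - c)\, dx = g(c) \tag{4.235}$$

In the finite-dimensional case, we then have

$$\int_{-\infty}^{\infty} d\hat E_x = \int_{-\infty}^{\infty} \sum_k \hat P_k\, \delta(x - b_k)\, dx = \sum_k \hat P_k = \hat I \tag{4.236}$$

$$\int_{-\infty}^{\infty} x\, d\hat E_x = \int_{-\infty}^{\infty} x \sum_k \hat P_k\, \delta(x - b_k)\, dx = \sum_k b_k \hat P_k = \hat B \tag{4.237}$$

$$\int_{-\infty}^{\infty} d\langle \alpha|\hat E_x|\beta\rangle = \int_{-\infty}^{\infty} \sum_k \langle \alpha|\hat P_k|\beta\rangle\, \delta(x - b_k)\, dx = \sum_k \langle \alpha|\hat P_k|\beta\rangle = \langle \alpha|\Big(\sum_k \hat P_k\Big)|\beta\rangle = \langle \alpha|\beta\rangle \tag{4.238}$$

$$\int_{-\infty}^{\infty} x\, d\langle \alpha|\hat E_x|\beta\rangle = \int_{-\infty}^{\infty} x \sum_k \langle \alpha|\hat P_k|\beta\rangle\, \delta(x - b_k)\, dx = \sum_k b_k \langle \alpha|\hat P_k|\beta\rangle = \langle \alpha|\Big(\sum_k b_k \hat P_k\Big)|\beta\rangle = \langle \alpha|\hat B|\beta\rangle \tag{4.239}$$

where we have used the fact that $\langle \alpha|\hat E_x|\beta\rangle$ is a complex function of $x$ which jumps in value by the amount $\langle \alpha|\hat P_k|\beta\rangle$ at $x = b_k$.

So, we see that the result

$$\hat E_x = \sum_k \hat P_k\, \theta(x - b_k) \tag{4.240}$$

works in the finite-dimensional case!!!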
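
The same conclusion can be reached numerically without any $\delta$-functions, by forming the Stieltjes sum $\sum_j x_j\,[\hat E_{x_j} - \hat E_{x_{j-1}}]$ on a fine grid. The sketch below is a hedged illustration: the eigenvalues are made-up numbers, the operators are stored as the diagonals they have in the eigenbasis, and the step is taken to be $1$ at $x = b_k$ (an assumption, since the definition above leaves that single point open).

```python
# Sketch: recover B = integral of x dE_x as a Stieltjes sum in the eigenbasis.
eigenvalues = [-1.0, 0.5, 2.0]   # assumed example spectrum of B

def E(x):
    """Diagonal of E_x = sum_k P_k theta(x - b_k), with theta(0) taken as 1."""
    return [1.0 if b <= x else 0.0 for b in eigenvalues]

def integrate_x_dE(lo, hi, n):
    """Sum over the grid of x_j * (E(x_j) - E(x_{j-1})), entry by entry."""
    total = [0.0] * len(eigenvalues)
    prev = E(lo)
    for j in range(1, n + 1):
        x = lo + (hi - lo) * j / n
        cur = E(x)
        total = [t + x * (c - p) for t, c, p in zip(total, cur, prev)]
        prev = cur
    return total

B_diag = integrate_x_dE(-3.0, 3.0, 6000)
print(B_diag)   # each entry lies within one grid step of the matching eigenvalue
```

Only the grid cells that contain an eigenvalue contribute, which is the discrete counterpart of the statement that $d\hat E_x$ vanishes away from the eigenvalues.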

Unitary operators can be handled in the same manner. We label the eigenvalues (absolute value = 1) of the unitary operator $\hat U$ by $u_k = e^{i\theta_k}$ with the $\theta$-values labeled in order

$$0 < \theta_1 < \theta_2 < \cdots < \theta_{m-1} < \theta_m \le 2\pi \tag{4.241}$$

As before we define

$$\hat E_x = \sum_{\theta_j \le x} \hat P_j = \sum_{\theta_j \le x} |u_j\rangle\langle u_j| \tag{4.242}$$

This operator now projects onto the subspace spanned by all eigenvectors for eigenvalues $u_k = e^{i\theta_k}$ with $\theta_k \le x$. If $x \le 0$, then $\hat E_x = 0$. If $x \ge 2\pi$, then $\hat E_x = \hat I$. $\hat E_x$ increments by $\hat P_k$ (the same as for a Hermitian operator) at the eigenvalues $u_k = e^{i\theta_k}$.

We can then write

$$\hat U = \sum_{k=1}^{m} u_k \hat P_k = \sum_{k=1}^{m} e^{i\theta_k} \hat P_k \to \int_0^{2\pi} e^{ix}\, d\hat E_x \tag{4.243}$$

and

$$\langle \alpha|\hat U|\beta\rangle = \int_0^{2\pi} e^{ix}\, d\langle \alpha|\hat E_x|\beta\rangle \tag{4.244}$$

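Summarizing, for every Hermitian operator $\hat H$ there corresponds a unique spectral family $\hat E_H(\alpha)$ that commutes with $\hat H$ such that for every $|f\rangle$ and every $|g\rangle$ in the domain of $\hat H$

$$\langle f|\hat H|g\rangle = \int_{-\infty}^{\infty} \alpha\, d\langle f|\hat E_H(\alpha)|g\rangle \tag{4.245}$$

where the integral is a Riemann-Stieltjes integral. This expression can be written in short as

$$\hat H = \int_{-\infty}^{\infty} \alpha\, d\hat E_H(\alpha) \tag{4.246}$$

and is called the spectral resolution of $\hat H$. We have

$$\hat E_H(\alpha) = \int_{-\infty}^{\alpha} d\hat E_H(\alpha') \tag{4.247}$$

and therefore

$$\hat I = \int_{-\infty}^{\infty} d\hat E_H(\alpha) \tag{4.248}$$

Now to every interval $\Delta = (\alpha_1, \alpha_2)$ of the real line, there corresponds a projection operator

$$\hat E_H[\Delta] = \int_{\alpha_1}^{\alpha_2} d\hat E_H(\alpha) = \hat E_H(\alpha_2) - \hat E_H(\alpha_1) \tag{4.249}$$

This is a definition that can be extended to a set of intervals. It follows from the properties of the spectral family $\hat E_H(\alpha)$ that

$$\hat E_H[\Delta]\,\hat E_H[\Delta'] = \hat E_H[\Delta \cap \Delta'] \tag{4.250}$$

and, for disjoint $\Delta$ and $\Delta'$,

$$\hat E_H[\Delta] + \hat E_H[\Delta'] = \hat E_H[\Delta \cup \Delta'] \tag{4.251}$$

The quantity $d\hat E_H(\alpha)$ can be thought of as the projection $\hat E_H(d\alpha)$ corresponding to an infinitesimal interval of length $d\alpha$ centered at $\alpha$.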


By definition of the Stieltjes integral one has

$$\hat H = \lim_{n \to \infty} \sum_{\Delta_j \in S_n} \alpha_j \hat E_H[\Delta_j] \tag{4.252}$$

where $\{S_n\}$ is a sequence of subdivisions of the real line, such that $S_n$ becomes infinitely fine as $n \to \infty$. The sum runs over all the intervals $\Delta_j = \alpha_j^{(n)} - \alpha_{j-1}^{(n)}$ of the subdivision $S_n$. If the spectral family $\hat E_K(\alpha)$ belonging to a Hermitian operator $\hat K$ is constant except for a countable number of isolated discontinuities at the points $\alpha_i$ of size $\hat E_K(\alpha_i) - \hat E_K(\alpha_i -) = \hat E_i$, then $\hat K$ has a spectral resolution entirely similar to the one in unitary space

$$\hat K = \sum_i \alpha_i \hat E_i \tag{4.253}$$

although in this case the number of terms in the sum may be infinite.

A general operator $\hat H$ can be regarded as the limit of a sequence

$$\hat H_n = \sum_{\Delta_j \in S_n} \alpha_j \hat E_H[\Delta_j] \tag{4.254}$$

of operators of type $\hat K$.

Let $\hat A$ and $\hat B$ be two operators with spectral families $\hat E_A(\alpha)$, $\hat E_B(\alpha)$. Suppose that $\hat A$ and $\hat B$ commute in the sense that $[\hat A, \hat B]\,|g\rangle = 0$ for every $|g\rangle$ in the domain of $[\hat A, \hat B]$, then one can show that

$$[\hat E_A(\alpha), \hat E_B(\alpha)]\,|g\rangle = 0 \tag{4.255}$$

However, unless the domain of $[\hat A, \hat B]$ coincides with the whole space, this relation does not imply that the spectral families commute. For this reason, when we deal with two operators $\hat A$ and $\hat B$ that possess a spectral family we shall use a stronger definition of commutativity:

$\hat A$ and $\hat B$ commute if their spectral families commute.


4.18. Spectral Decomposition - More Details 

It turns out that for infinite-dimensional vector spaces there exist Hermitian 
and unitary operators that have no eigenvectors and eigenvalues. 

Consider the eigenvalue equation

$$-i\frac{d}{dx}\psi(x) = \hat D\psi(x) = \beta\psi(x) \tag{4.256}$$

This is a differential equation whose solution is

$$\psi(x) = c\,e^{i\beta x} \quad , \quad c = \text{constant} \tag{4.257}$$

Suppose the operator

$$\hat D = -i\frac{d}{dx} \tag{4.258}$$

is defined on the interval $a \le x \le b$. Then its adjoint operator $\hat D^\dagger$ is defined by the relation

$$\langle \phi|\hat D^\dagger|\psi\rangle = \langle \psi|\hat D|\phi\rangle^* \tag{4.259}$$

or for function spaces

$$\int_a^b \phi^*(x)\hat D^\dagger\psi(x)\, dx = \left\{\int_a^b \psi^*(x)\hat D\phi(x)\, dx\right\}^* = \int_a^b \phi^*(x)\hat D\psi(x)\, dx + i\left[\psi(x)\phi^*(x)\right]\Big|_a^b \tag{4.260}$$

where the last step follows from an integration by parts. If boundary conditions are imposed so that the last term (called the surface term) vanishes, then $\hat D$ will be a Hermitian operator; otherwise it is not Hermitian.
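
For readers who want the integration by parts spelled out, the intermediate steps can be written as follows. This is a worked restatement using only $\hat D = -i\,d/dx$ and the definition of the adjoint given above; no further assumptions enter.

```latex
\begin{aligned}
\left\{\int_a^b \psi^*(x)\,\hat D\phi(x)\,dx\right\}^*
  &= \left\{-i\int_a^b \psi^*(x)\,\frac{d\phi}{dx}\,dx\right\}^*
   = i\int_a^b \psi(x)\,\frac{d\phi^*}{dx}\,dx \\
  &= i\,\Big[\psi(x)\,\phi^*(x)\Big]_a^b
     - i\int_a^b \phi^*(x)\,\frac{d\psi}{dx}\,dx \\
  &= \int_a^b \phi^*(x)\,\hat D\psi(x)\,dx
     + i\,\Big[\psi(x)\,\phi^*(x)\Big]_a^b
\end{aligned}
```

The surface term vanishes, for example, when $\psi$ and $\phi$ take equal values at $a$ and $b$ or when both vanish at the end points.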


Now let us try to find the eigenvectors (eigenfunctions in this case) within a 
particular vector space. It turns out that we can define several different vector 
spaces depending on the boundary conditions that we impose. 


Case 1: No boundary conditions 

In this case, all complex $\beta$ are eigenvalues and $\hat D$ is not Hermitian. In quantum mechanics we will be interested in Hermitian operators, so we are not really interested in this case.

Case 2: $a = -\infty$, $b = +\infty$ with $|\psi(x)|$ bounded as $|x| \to \infty$

All real values of $\beta$ are eigenvalues. The eigenfunctions $\psi(x)$ are not normalizable since

$$\int_{-\infty}^{\infty} |\psi(x)|^2\, dx = |c|^2 \int_{-\infty}^{\infty} dx = \infty \tag{4.261}$$

They do, however, form a complete set in the sense that an arbitrary function can be represented as the Fourier integral

$$q(x) = \int_{-\infty}^{\infty} F(\beta)\,e^{i\beta x}\, d\beta \tag{4.262}$$

which may be regarded as a continuous linear combination of the eigenfunctions. In this case, $F(\beta)$ is the Fourier transform of $q(x)$.

Case 3: $a = -\frac{L}{2}$, $b = +\frac{L}{2}$ with periodic boundary conditions $\psi(-\frac{L}{2}) = \psi(\frac{L}{2})$

The eigenvalues form a discrete set, $\beta_n$, satisfying

$$e^{-i\beta_n \frac{L}{2}} = e^{i\beta_n \frac{L}{2}} \to e^{i\beta_n L} = 1 \tag{4.263}$$

which implies

$$\beta_n L = 2n\pi \to \beta_n = \frac{2n\pi}{L} \tag{4.264}$$

where $n$ = integers such that $-\infty \le n \le \infty$. These eigenfunctions form a complete orthonormal set (normalize by choosing the constant $c$ appropriately) and $\hat D$ is Hermitian. The completeness of the eigenfunction set follows from the theory of Fourier series.
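
The orthonormality of the Case 3 eigenfunctions can be checked numerically. The sketch below assumes the normalization $c = 1/\sqrt{L}$, a box length $L = 2$, and a midpoint-rule integration with $N = 64$ sample points; these numbers and the function names are illustrative choices, not part of the text.

```python
import cmath
import math

L = 2.0   # assumed box length
N = 64    # number of midpoint-rule sample points (arbitrary choice)

def psi(n, x):
    """Eigenfunction exp(i beta_n x)/sqrt(L) with beta_n = 2 n pi / L."""
    beta = 2 * math.pi * n / L
    return cmath.exp(1j * beta * x) / math.sqrt(L)

def inner(m, n):
    """Midpoint-rule estimate of the integral of psi_m^* psi_n over [-L/2, L/2]."""
    h = L / N
    xs = [-L / 2 + (k + 0.5) * h for k in range(N)]
    return sum(psi(m, x).conjugate() * psi(n, x) for x in xs) * h

print(abs(inner(2, 2)))                          # close to 1
print(abs(inner(1, 3)))                          # close to 0
print(abs(psi(3, -L / 2) - psi(3, L / 2)))       # periodic boundary condition
```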

Case 4: $a = -\infty$, $b = +\infty$ with $\psi(x) \to 0$ as $|x| \to \infty$

In this case, the operator D is Hermitian (the surface term vanishes), but it has 
no eigenfunctions within the space. 

So, a Hermitian operator on an infinite-dimensional space may or may not 
possess a complete set of eigenfunctions, depending on the precise nature of 
the operator and the vector space. 

It turns out, however, that the decomposition into projection operators can be 
reformulated in a way that does not rely on the existence of eigenfunctions. 
This alternative formulation uses the integral form of the projection operators 
derived earlier. 

We need, however, to remind ourselves of some ideas we stated above (since it never hurts to repeat important stuff).

Let $\hat E_1$ and $\hat E_2$ be projection operators onto subspaces $M_1$ and $M_2$, respectively. We say that $\hat E_1$ and $\hat E_2$ are orthogonal if $M_1$ and $M_2$ are orthogonal (every vector in $M_1$ is orthogonal to every vector in $M_2$). We can express this orthogonality, in general, using the relation

$$\hat E_j \hat E_k = \delta_{jk} \hat E_k \tag{4.265}$$

If $M_1$ is contained in $M_2$, we write $\hat E_1 \le \hat E_2$. This means that $\hat E_1 \hat E_2 = \hat E_1$ or, equivalently, $\hat E_2 \hat E_1 = \hat E_1$. If $\hat E_1 \le \hat E_2$, then $\|\hat E_1|\alpha\rangle\| \le \|\hat E_2|\alpha\rangle\|$ for any vector $|\alpha\rangle$. If $\hat E_1 \hat E_2 = \hat E_2 \hat E_1$, then $\hat E_1 \hat E_2$ is a projection operator that projects onto the subspace which is the intersection of $M_1$ and $M_2$, that is, the set of all vectors that are in both $M_1$ and $M_2$.

If $\hat E_1$ and $\hat E_2$ are orthogonal, then $\hat E_1 + \hat E_2$ is the projection operator onto $M_1 \oplus M_2$ (the direct sum).

If $\hat E_1 \le \hat E_2$, then $\hat E_2 - \hat E_1$ is the projection operator onto the subspace which is the orthogonal complement of $M_1$ in $M_2$, that is, the set of vectors in $M_2$ which are orthogonal to every vector in $M_1$.

Definition: A family of projection operators $\hat E_x$ depending on a real parameter $x$ is a spectral family if it has the following properties:

1. if $x \le y$ then $\hat E_x \le \hat E_y$ or $\hat E_x \hat E_y = \hat E_x = \hat E_y \hat E_x$, which means that $\hat E_x$ projects onto the subspace corresponding to eigenvalues $\le x$.

2. if $\epsilon$ is positive, then $\hat E_{x+\epsilon}|\eta\rangle \to \hat E_x|\eta\rangle$ as $\epsilon \to 0$ for any vector $|\eta\rangle$ and any $x$.

3. $\hat E_x|\eta\rangle \to |0\rangle$ (the null vector) as $x \to -\infty$ and $\hat E_x|\eta\rangle \to |\eta\rangle$ as $x \to \infty$ for any vector $|\eta\rangle$.

For each self-adjoint operator $\hat B$ there is a unique spectral family of projection operators $\hat E_x$ such that

$$\langle \gamma|\hat B|\eta\rangle = \int_{-\infty}^{\infty} x\, d\langle \gamma|\hat E_x|\eta\rangle \tag{4.266}$$

for all vectors $|\eta\rangle$ and $|\gamma\rangle$. We then write

$$\hat B = \int_{-\infty}^{\infty} x\, d\hat E_x \tag{4.267}$$

This is called the spectral decomposition or spectral resolution of $\hat B$.

The same results hold for unitary operators. For each unitary operator $\hat U$ there is a unique spectral family of projection operators $\hat E_x$ such that $\hat E_x = 0$ for $x \le 0$ and $\hat E_x = \hat I$ for $x \ge 2\pi$ and

$$\langle \gamma|\hat U|\eta\rangle = \int_0^{2\pi} e^{ix}\, d\langle \gamma|\hat E_x|\eta\rangle \tag{4.268}$$

for all vectors $|\eta\rangle$ and $|\gamma\rangle$. We then write

$$\hat U = \int_0^{2\pi} e^{ix}\, d\hat E_x \tag{4.269}$$

This is called the spectral decomposition or spectral resolution of $\hat U$.

Both of these results generalize for functions of an operator, i.e.,

$$g(\hat B) = \int_{-\infty}^{\infty} g(x)\, d\hat E_x \tag{4.270}$$

We considered the case of a discrete spectrum of eigenvalues earlier and found that when the operator $\hat B$ has the eigenvalue equation

$$\hat B|b_k\rangle = b_k|b_k\rangle \tag{4.271}$$

we then have

$$\hat E_x = \sum_k \hat P_k\, \theta(x - b_k) = \sum_k |b_k\rangle\langle b_k|\, \theta(x - b_k) \tag{4.272}$$

so that

$$d\hat E_x = \sum_k |b_k\rangle\langle b_k|\, \delta(x - b_k)\, dx \tag{4.273}$$

which implies that the only contributions to the integral occur at the eigenvalues $b_k$.


We can state all of this stuff even more formally once again.

The Eigenvalue Problem and the Spectrum

Based on our previous discussion, we can now say that the eigenvalue problem $\hat H|f\rangle = \lambda|f\rangle$ has solutions if and only if the spectral family $\hat E_H(\alpha)$ is discontinuous. The eigenvalues $\lambda_i$ are the points of discontinuity of $\hat E_H(\alpha)$. The eigenprojection $\hat E_i$ belonging to $\lambda_i$ is the discontinuity of $\hat E_H(\alpha)$ at $\lambda_i$:

$$\hat E_i = \hat E_H(\lambda_i) - \hat E_H(\lambda_i -) \tag{4.274}$$

It follows that the eigenprojections are orthogonal and that (for hermitian $\hat H$) the eigenvalues are real.

The spectrum of $\hat H$ consists of the set $\Sigma$ of points of the real axis where the spectral family $\hat E_H(\alpha)$ is increasing. If the spectral family is constant except for a (necessarily countable) number of discontinuities, $\hat H$ is said to have a pure discrete spectrum. If the spectral family is everywhere continuous, $\hat H$ is said to have a pure continuous spectrum. Otherwise the spectrum is said to be mixed. If and only if $\hat H$ is bounded (positive) is its spectrum bounded (positive). Since the points of constancy of $\hat E_H(\alpha)$ do not contribute to the integral in

$$\hat H = \int_{-\infty}^{\infty} \alpha\, d\hat E_H(\alpha) \tag{4.275}$$

the region of integration can be restricted to the spectrum and we can write

$$\hat H = \int_{\Sigma} \alpha\, d\hat E_H(\alpha) \tag{4.276}$$

Suppose that an operator $\hat H$ has a mixed spectrum $\Sigma$. The subspace $M$ spanned by its eigenvectors reduces $\hat H$ so that we can write $\hat H = \hat H_d \oplus \hat H_c$ where $\hat H_d$ and $\hat H_c$ are operators on $M$ and $M^\perp$, respectively. The spectrum $\Sigma_d$ of $\hat H_d$ is pure discrete and is called the discrete spectrum of $\hat H$ while the spectrum $\Sigma_c$ of $\hat H_c$ is pure continuous and is called the continuous spectrum of $\hat H$. Note that $\Sigma = \Sigma_d \cup \Sigma_c$, but that $\Sigma_d$ and $\Sigma_c$ may have points in common, that is, there may be eigenvalues embedded in the continuous spectrum. Separating out the discrete spectrum, the spectral resolution of $\hat H$ can be written

$$\hat H = \sum_i \alpha_i \hat E_i + \int_{\Sigma_c} \alpha\, d\hat E_H(\alpha) \quad , \quad \alpha_i \in \Sigma_d \tag{4.277}$$

Clearly, the terms on the RHS are, respectively, $\hat H_d$ and $\hat H_c$ when regarded as operators on the vector space.

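Now let us consider the case of a continuous spectrum of eigenvalues. In particular, consider the multiplicative operator $\hat Q$ defined on $L^2(-\infty, \infty)$ by

$$\hat Q g(x) = x g(x) \tag{4.278}$$

for all functions $g(x)$ in $L^2(-\infty, \infty)$. This is a Hermitian operator since

$$\begin{aligned} \int_a^b \phi^*(x)\hat Q^\dagger \psi(x)\, dx &= \left\{\int_a^b \psi^*(x)\hat Q\phi(x)\, dx\right\}^* = \left\{\int_a^b \psi^*(x)\, x\, \phi(x)\, dx\right\}^* \\ &= \int_a^b \phi^*(x)\, x\, \psi(x)\, dx = \int_a^b \phi^*(x)\hat Q\psi(x)\, dx \end{aligned} \tag{4.279}$$

Now suppose that $\hat Q$ has an eigenvalue equation with eigenfunctions $q(x)$ in $L^2(-\infty, \infty)$ of the form

$$\hat Q q(x) = \beta q(x) \tag{4.280}$$

Since all functions in $L^2(-\infty, \infty)$, including $q(x)$, must also satisfy $\hat Q g(x) = x g(x)$, we then have

$$\hat Q q(x) = x q(x) = \beta q(x) \quad \text{or} \quad (x - \beta) q(x) = 0 \text{ for all } x \tag{4.281}$$

The formal solution to this equation is the Dirac $\delta$-function

$$q(x) = \delta(x - \beta) \tag{4.282}$$

The spectral theorem still applies to this operator. The projection operators for $\hat Q$, in this case, are given by

$$\hat E_\beta g(x) = \theta(\beta - x) g(x) \tag{4.283}$$

This is equal to $g(x)$ if $x < \beta$ and is $0$ for $x > \beta$. We then have

$$\hat Q = \int_{-\infty}^{\infty} \beta\, d\hat E_\beta \tag{4.284}$$

which can be easily verified by

$$\begin{aligned} \int_{-\infty}^{\infty} \beta\, d\hat E_\beta\, g(x) &= \int_{-\infty}^{\infty} \beta\, d[\theta(\beta - x) g(x)] \\ &= \int_{-\infty}^{\infty} \beta \frac{d}{d\beta}[\theta(\beta - x) g(x)]\, d\beta = \int_{-\infty}^{\infty} \beta\, \delta(\beta - x) g(x)\, d\beta = x g(x) \end{aligned} \tag{4.285}$$

So the decomposition into projection operators can still be defined in the case of a continuous spectrum.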
So the decomposition into projection operators can still be defined in the case 
of a continuous spectrum. 


Saying it still another way. Let $\hat E_Q(\alpha)$ be its spectral family and let us put $(\hat E_Q(\alpha) f)(x) = g(\alpha, x)$. The spectral resolution of $\hat Q$ requires that

$$x f(x) = (\hat Q f)(x) = \int_{-\infty}^{\infty} \alpha\, (d\hat E_Q(\alpha) f)(x) = \int_{-\infty}^{\infty} \alpha\, dg(\alpha, x) \tag{4.286}$$

A solution of this equation is obtained if we set $dg(\alpha, x) = \delta(\alpha - x) f(x)\, d\alpha$. We then get

$$\begin{aligned} (\hat E_Q(\alpha) f)(x) = g(\alpha, x) &= \int_{-\infty}^{\alpha} dg(\alpha', x) \\ &= \int_{-\infty}^{\alpha} \delta(\alpha' - x) f(x)\, d\alpha' = \chi(\alpha - x) f(x) \end{aligned} \tag{4.287}$$

where $\chi(\alpha - x)$ is the Heaviside function

$$\chi(x) = \begin{cases} +1 & \text{if } 0 \le x \\ 0 & \text{if } 0 > x \end{cases} \tag{4.288}$$

According to this calculation $\hat E_Q(\alpha)$ is the multiplicative operator $\chi(\alpha - x)$ as above. We now need to verify that the solution we have constructed gives the unique spectral family belonging to $\hat Q$.

First we show that the solution is a spectral family, that is, that it has the properties required by the definitions given earlier. Since the $\hat E_Q(\alpha)$ are multiplicative operators, it is clear that they commute. Property (1) is equivalent to

$$\hat E(\alpha)\hat E(\alpha') = \hat E(\alpha) \quad , \quad \alpha \le \alpha' \tag{4.289}$$

a property that is clearly satisfied by the multiplicative operator $\chi(\alpha - x)$. Property (2) holds because $\chi(x)$ was defined to be continuous on the right, while properties (3) are clear.

Next we have to verify that the spectral family found does in fact belong to $\hat Q$. According to the spectral theorem we must have

$$\langle f|\hat Q|g\rangle = \int_{-\infty}^{\infty} \alpha\, d\langle f|\hat E_Q(\alpha)|g\rangle \tag{4.290}$$

or

$$\int_{-\infty}^{\infty} f^*(x)\, x\, g(x)\, dx = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} \alpha\, d\alpha\, f^*(x)\, \delta(\alpha - x)\, g(x) \tag{4.291}$$

which is clearly true by definition of the $\delta$-function.

The spectral family $\hat E_Q(\alpha)$ is continuous everywhere and increasing in $(-\infty, \infty)$. The spectrum of $\hat Q$ consists therefore of the whole of the real axis and is pure continuous. The projection $\hat E_Q[\Delta]$ is the characteristic function $\chi(\Delta)$ of the interval $\Delta$, that is, a function of $x$ equal to one if $x \in \Delta$ and zero otherwise.
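
The statement that the multiplicative step operators resolve $\hat Q$ can be illustrated numerically. The sketch below forms the Stieltjes sum $\sum_j \alpha_j\,[(\hat E_Q(\alpha_j) - \hat E_Q(\alpha_{j-1})) f](x)$ at a single point $x$ and compares it with $x f(x)$. The test function (a Gaussian), the grid limits, and the function names are assumptions made for the illustration.

```python
import math

def chi(t):
    """Heaviside function, continuous on the right: 1 for t >= 0, else 0."""
    return 1.0 if t >= 0 else 0.0

def E_Q(alpha, f):
    """Return the function (E_Q(alpha) f)(x) = chi(alpha - x) f(x)."""
    return lambda x: chi(alpha - x) * f(x)

def Q_from_spectral_sum(f, x, lo=-5.0, hi=5.0, n=10000):
    """Stieltjes sum over a grid of alpha values, evaluated at the point x."""
    total = 0.0
    prev = E_Q(lo, f)(x)
    for j in range(1, n + 1):
        alpha = lo + (hi - lo) * j / n
        cur = E_Q(alpha, f)(x)
        total += alpha * (cur - prev)
        prev = cur
    return total

f = lambda x: math.exp(-x * x)   # assumed square-integrable test function
x0 = 0.7303
approx = Q_from_spectral_sum(f, x0)
exact = x0 * f(x0)
print(approx, exact)   # agree to within the grid spacing
```

Only the single grid cell containing $x$ contributes, which mirrors the role of $\delta(\alpha - x)$ in the derivation.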


Some final thoughts about these ideas 


Projection Operators and Continuous Spectra - An Example 


In the macroscopic world, if we want to locate the position of an object, we use a calibrated ruler. Formally, the physical position $x$ is a continuous variable. The ruler, however, only has a finite resolution. An outcome anywhere within the $j$th interval is said to correspond to the value $x_j$. Thus, effectively, the result of the position measurement is not the original continuous variable $x$, but rather a staircase function,

$$x' = f(x) = x_j \quad , \quad \forall\; x_j \le x < x_{j+1} \tag{4.292}$$

Figure 4.1: Staircase Function

as illustrated in Figure 4.1 above.

These considerations are easily translated in quantum language.

In the $x$-representation, an operator $\hat x'$ is defined as multiplication by the staircase function $f(x)$. This operator has a finite number of discrete eigenvalues $x_j$. Each one of the eigenvalues is infinitely degenerate, that is, any state vector with domain between $x_j$ and $x_{j+1}$ falls entirely within the $j$th interval of the ruler (see Figure 4.1), and therefore corresponds to the degenerate eigenvalue $x_j$.

Orthogonal resolution of the Identity 

An experimental setup for a quantum test described by the above formalism could have, at its final stage, an array of closely packed detectors, labeled by the real numbers $x_j$. Such a quantum test thus asks, simultaneously, a set of questions

$$\text{"Is } x_j \le x < x_{j+1} \text{ ?"} \tag{4.293}$$

(one question for each $j$). The answers, yes and no, can be given numerical values 1 and 0, respectively. Each of these questions therefore corresponds to the operator $\hat P_j$, which is itself a function of $x$:

$$\hat P_j(x) = \begin{cases} 1 & \text{if } x_j \le x < x_{j+1} \\ 0 & \text{otherwise} \end{cases} \tag{4.294}$$

Clearly, these operators satisfy

$$\hat P_j \hat P_k = \delta_{jk} \hat P_k \quad \text{and} \quad \sum_j \hat P_j = \hat I \tag{4.295}$$

This implies that they are projection operators (or projectors) and the questions will correspond to the measurement of the projection operators.

The staircase function $x' = f(x)$ defined above can then be written as

$$\hat x' = \sum_j x_j \hat P_j \tag{4.296}$$

This operator $\hat x'$ approximates the operator $\hat x$ as well as is allowed by the finite resolution of the ruler.
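
A small sketch of the ruler picture follows, with yes answered by 1 and no by 0. The ruler marks are made-up values and the function names are hypothetical; in the $x$-representation each projector acts as multiplication by an indicator function, so it can be modelled here as an ordinary function of $x$.

```python
# Sketch: a ruler with marks x_0 < x_1 < ... modelled by indicator functions.
edges = [0.0, 1.0, 2.0, 3.0, 4.0]   # assumed ruler marks

def P(j, x):
    """Answer to 'Is x_j <= x < x_{j+1} ?': 1 for yes, 0 for no."""
    return 1 if edges[j] <= x < edges[j + 1] else 0

def x_prime(x):
    """Staircase reading x' = sum_j x_j P_j(x)."""
    return sum(edges[j] * P(j, x) for j in range(len(edges) - 1))

def E(j, x):
    """E(x_j) = sum of P_k for k < j, the answer to 'Is x < x_j ?'."""
    return sum(P(k, x) for k in range(j))

x = 2.7
print(x_prime(x))                                   # ruler reading for x = 2.7
print(sum(P(j, x) for j in range(len(edges) - 1)))  # the P_j sum to 1
print(P(1, x) * P(2, x))                            # distinct P_j are orthogonal
print(E(2, x), E(3, x))                             # answers to 'x < 2 ?' and 'x < 3 ?'
```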

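How do we proceed to the continuum limit? Let us define a spectral family of operators

$$\hat E(x_j) = \sum_{k=0}^{j-1} \hat P_k \tag{4.297}$$

They obey the recursion relations

$$\hat E(x_{j+1}) = \hat E(x_j) + \hat P_j \tag{4.298}$$

and the boundary conditions

$$\hat E(x_{min}) = 0 \quad , \quad \hat E(x_{max}) = \hat I \tag{4.299}$$

The physical meaning of the operator $\hat E(x_j)$ is the question

$$\text{"Is } x < x_j \text{ ?"} \tag{4.300}$$

with answers yes = 1 and no = 0.

We then have

$$\hat E(x_j)\hat E(x_m) = \sum_{k=0}^{j-1} \hat P_k \sum_{n=0}^{m-1} \hat P_n = \sum_{k=0}^{j-1}\sum_{n=0}^{m-1} \hat P_k \hat P_n = \sum_{k=0}^{j-1}\sum_{n=0}^{m-1} \delta_{kn} \hat P_n = \begin{cases} \hat E(x_j) & \text{if } x_j \le x_m \\ \hat E(x_m) & \text{if } x_m \le x_j \end{cases} \tag{4.301}$$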


so that the E(xj) are projectors. 
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We can now pass to the continuum limit. We define E(f) as the projector which 
represents the question 

"is x<f ?" (4.302) 

and which returns, as the answer, a numerical value (yes = 1, no = 0). We 
then consider two neighboring values, £ and £ + df , and define an infinitesimal 
projector, 

dE(0 = E(£ + dZ)-E(Q (4.303) 

which represents the question ”Is £ < x < £ + df ?”. This dE(f) thus behaves 
as an infinitesimal increment P :j in the equation 


E{ x j+1 ) = E(xj) + Pj (4.304) 

We then have, instead of the staircase approximation, the exact result 

x = [\dE(n (4.305) 

Jo 

Note that the integration limits are actually operators, namely, E(x m i n ) = 0 
and E(xmax) = 1- 

This equation is the spectral decomposition or spectral resolution of the operator
$x$, and the operators $E(\xi)$ are the spectral family (or resolution of the identity)
generated by $x$. We can now define any function of the operator $x$:

$$f(x) = \int_0^1 f(\xi) \, dE(\xi) \tag{4.306}$$

Note that the right-hand sides of the last two equations are Stieltjes integrals.

If, for a small increment $d\xi \to 0$, the limit $dE(\xi)/d\xi$ exists, then the
integration step can be taken as the c-number $d\xi$ rather than $dE(\xi)$, which is
an operator. We then have an operator-valued Riemann integral

$$\int_0^1 f(\xi) \, dE(\xi) = \int_{x_{\min}}^{x_{\max}} f(\xi) \, \frac{dE(\xi)}{d\xi} \, d\xi \tag{4.307}$$

This type of spectral decomposition applies not only to operators with continuous
spectra, but also to those having discrete spectra, or even mixed spectra.

For a discrete spectrum, $dE(\xi) = 0$ if $\xi$ lies between consecutive eigenvalues, and
$dE(\xi) = P_k$, that is, the projector on the $k$th eigenstate, if the $k$th eigenvalue
lies between $\xi$ and $\xi + d\xi$.

The projector $E(\xi)$ is a bounded operator, which depends on the parameter $\xi$.
It may be a discontinuous function of $\xi$, but it never is infinite, and we never
actually need $dE(\xi)/d\xi$. This is the advantage of the Stieltjes integral over the
more familiar Riemann integral: the left-hand side of the last equation is always
meaningful, even if the right-hand side is not.

Some Useful Properties 

If $f(\xi)$ is a real function, then $f(x)$, given by

$$f(x) = \int_0^1 f(\xi) \, dE(\xi) \tag{4.308}$$

is a self-adjoint operator.

We also have

$$\int f(\xi) \, dE(\xi) \int g(\eta) \, dE(\eta) = \int f(\xi) g(\xi) \, dE(\xi) \tag{4.309}$$

$$e^{i f(x)} = \int e^{i f(\xi)} \, dE(\xi) \tag{4.310}$$

The spectral decomposition of a self-adjoint operator will allow us to give a rigorous
definition of the measurement of a continuous variable. It will be equivalent
to an infinite set of yes-no questions where each question is represented by a
bounded (but infinitely degenerate) projection operator.

Looking ahead to quantum mechanics.

An operator such as $Q$ that has a continuous spectrum is said to have a formal
eigenvalue equation in Dirac language

$$Q|q\rangle = q|q\rangle \tag{4.311}$$

In the development of the theory, we will make assumptions that will lead to
the orthonormality condition for the continuous case taking the form

$$\langle q'|q''\rangle = \delta(q' - q'') \tag{4.312}$$

Since this implies that $\langle q|q\rangle = \infty$, these formal eigenvectors have infinite norm.
Thus, the Dirac formulation that we will construct for operators with a continuous
spectrum will not fit into the mathematical theory of Hilbert space, which
admits only vectors of finite norm.

Operators will take the form

$$Q = \int_{-\infty}^{\infty} q \, |q\rangle\langle q| \, dq \tag{4.313}$$

which is the continuous analog of our earlier expressions.

The projection operator will be formally given by

$$E_\beta = \int_{-\infty}^{\beta} |q\rangle\langle q| \, dq \tag{4.314}$$

It is well-defined in Hilbert space, but its derivative with respect to the upper limit,

$$\frac{dE_q}{dq} = |q\rangle\langle q| \tag{4.315}$$

does not exist within the Hilbert space framework.

There are two alternative methods for making quantum mechanics fit within a
mathematically rigorous Hilbert space framework. The first would be to restrict
or revise the formalism to make it fit (that is, still not admit states of infinite norm).
The second would be to extend the Hilbert space so that vectors of infinite norm
are allowed. We will discuss these ideas later and make appropriate choices.


We saw earlier in the finite-dimensional case, where we have a discrete spectrum
of eigenvalues, that the value of the projection operator $E_x$ for the operator $B$
jumped by $P_k = |b_k\rangle\langle b_k|$ as $x$ passed through the $k$th eigenvalue $b_k$.

For the infinite dimensional case, where we can have both a discrete and a 
continuous spectrum of eigenvalues, the projection operator behaves in the same 
way as we move about the discrete part of the spectrum. In the continuous part 
of the spectrum, however, it is possible for the projection operator to exhibit a 
continuous increase in value. 


In a more formal manner we state: if $B$ is a self-adjoint operator and $E_x$ is the
projection operator of its spectral decomposition, then

1. the set of points $x$ on which $E_x$ increases is called the spectrum of $B$;
alternatively, a point is in the spectrum if it is not in an interval on which
$E_x$ is constant.

2. the set of points $x$ on which $E_x$ jumps is called the point spectrum of $B$;
the point spectrum is the set of all eigenvalues.

3. the set of points $x$ on which $E_x$ increases continuously is called the continuous
spectrum of $B$.

4. the point spectrum and the continuous spectrum comprise the total spectrum.

In quantum mechanics, as we shall see, a real physical, measurable quantity will 
be represented by a self-adjoint operator. The spectrum of the operator will be 
the set of real numbers that correspond to the possible measured values of the 
physical quantity. The projection operators in the spectral decomposition will 
be used to describe the probability distributions of these values and the state 
operators. We will get discrete probability distributions over the point spectrum 
and continuous probability distributions over the continuous spectrum. 

All of these new mathematical quantities will have direct and important physical 
meaning! 
4.19. Functions of Operators (general case); Stone's Theorem

We mentioned earlier that functions of operators also have a decomposition into
projection operators. In fact, we wrote

$$g(B) = \int_{-\infty}^{\infty} g(x) \, dE_x \tag{4.316}$$

The ability to deal with functions of operators will be very important in quantum
mechanics, so let us spend more time looking at the mathematical details
involved.


For any self-adjoint operator we have the spectral decomposition

$$B = \int_{-\infty}^{\infty} x \, dE_x \tag{4.317}$$

If $f(x)$ is a complex function of a real variable $x$, then we can define the same
function of an operator by

$$\langle\alpha| f(B) |\beta\rangle = \int_{-\infty}^{\infty} f(x) \, d\langle\alpha| E_x |\beta\rangle \tag{4.318}$$

for all vectors $|\alpha\rangle$ and $|\beta\rangle$. Now a self-adjoint operator is bounded if and only if
its spectrum is bounded. If $f(x)$ is a bounded function on the spectrum of $B$,
then it turns out that the above equation for all $|\alpha\rangle$ and $|\beta\rangle$ defines $f(B)|\eta\rangle$ for
all vectors $|\eta\rangle$.


Is the above definition reasonable? We can see that it is by looking at some 
properties and some simple examples. 


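Let $f(x) = x$. This implies that $f(B) = B$ since we must have

$$\langle\alpha| B |\beta\rangle = \int_{-\infty}^{\infty} x \, d\langle\alpha| E_x |\beta\rangle \tag{4.319}$$

and, for $f(x) = 1$,

$$\langle\alpha| I |\beta\rangle = \int_{-\infty}^{\infty} d\langle\alpha| E_x |\beta\rangle = \langle\alpha|\beta\rangle \tag{4.320}$$

If we let

$$(f + g)(x) = f(x) + g(x) \tag{4.321}$$

$$(cf)(x) = c f(x) \tag{4.322}$$

we then have

$$(f + g)(B) = f(B) + g(B) \tag{4.323}$$

$$(cf)(B) = c f(B) \tag{4.324}$$

so that sums and multiples of functions of operators are defined in a standard
way.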
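If we let

$$(fg)(x) = f(x) g(x) \tag{4.325}$$

we then have

$$\begin{aligned}
\langle\alpha| f(B) g(B) |\beta\rangle &= \int_{-\infty}^{\infty} f(x) \, d_x \langle\alpha| E_x g(B) |\beta\rangle \\
&= \int_{-\infty}^{\infty} f(x) \, d_x \int_{-\infty}^{\infty} g(y) \, d_y \langle\alpha| E_x E_y |\beta\rangle \\
&= \int_{-\infty}^{\infty} f(x) \, d_x \int_{-\infty}^{x} g(y) \, d_y \langle\alpha| E_y |\beta\rangle \\
&= \int_{-\infty}^{\infty} f(x) g(x) \, d\langle\alpha| E_x |\beta\rangle \\
&= \int_{-\infty}^{\infty} (fg)(x) \, d\langle\alpha| E_x |\beta\rangle = \langle\alpha| (fg)(B) |\beta\rangle
\end{aligned} \tag{4.326}$$

so that

$$(fg)(B) = f(B) g(B) \tag{4.327}$$

as it should. We can also show that $f(B) g(B) = g(B) f(B)$ so that all functions
of the operator $B$ commute.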


With these properties, we can then define a polynomial function of an operator
by

$$f(x) = c_0 + c_1 x + c_2 x^2 + \ldots + c_n x^n \tag{4.328}$$

$$f(B) = c_0 + c_1 B + c_2 B^2 + \ldots + c_n B^n \tag{4.329}$$

for any vectors $|\alpha\rangle$ and $|\beta\rangle$.

Thus, products of functions of operators are also defined in the standard way.
If we let $(f^*)(x) = f^*(x)$, then for any vectors $|\alpha\rangle$ and $|\beta\rangle$ we have

$$\langle\alpha| [f(B)]^\dagger |\beta\rangle = \langle\beta| f(B) |\alpha\rangle^* = \int_{-\infty}^{\infty} f^*(x) \, d\langle\alpha| E_x |\beta\rangle = \int_{-\infty}^{\infty} (f^*)(x) \, d\langle\alpha| E_x |\beta\rangle \tag{4.330}$$

or

$$[f(B)]^\dagger = (f^*)(B) \tag{4.331}$$

If $f(x)$ is a real function, this implies that $f(B)$ is also a self-adjoint operator.

If $f^* f = 1$, then $f(B)$ is a unitary operator since

$$[f(B)]^\dagger f(B) = I = f(B) [f(B)]^\dagger \tag{4.332}$$


Now for any vector $|\alpha\rangle$ we have

$$\langle\alpha| f(B) |\alpha\rangle = \int_{-\infty}^{\infty} f(x) \, d\langle\alpha| E_x |\alpha\rangle = \int_{-\infty}^{\infty} f(x) \, d\langle\alpha| E_x E_x |\alpha\rangle = \int_{-\infty}^{\infty} f(x) \, d\| E_x |\alpha\rangle \|^2 \tag{4.333}$$

If we define a self-adjoint operator to be positive if and only if its spectrum is
non-negative, then $f(B)$ is positive if $f(x)$ is non-negative over the spectrum
of $B$ and $f(B)$ is bounded if $|f(x)|$ is bounded over the spectrum of $B$.


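In the special case where $B$ has only a point spectrum, we have

$$B = \sum_k b_k P_k \tag{4.334}$$

and then for any vectors $|\alpha\rangle$ and $|\beta\rangle$

$$\langle\alpha| f(B) |\beta\rangle = \langle\alpha| f\Big(\sum_k b_k P_k\Big) |\beta\rangle = \sum_k \langle\alpha| f(b_k P_k) |\beta\rangle = \sum_k f(b_k) \langle\alpha| P_k |\beta\rangle = \langle\alpha| \sum_k f(b_k) P_k |\beta\rangle \tag{4.335}$$

or

$$f(B) = \sum_k f(b_k) P_k \tag{4.336}$$

as we expected.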

We define the same properties for a unitary operator. Let

$$U = \int_0^{2\pi} e^{ix} \, dE_x \tag{4.337}$$

and also let

$$\langle\alpha| B |\beta\rangle = \int_0^{2\pi} x \, d\langle\alpha| E_x |\beta\rangle \tag{4.338}$$

for all vectors in the space.

This defines a bounded self-adjoint operator $B$ with spectral decomposition

$$B = \int_0^{2\pi} x \, dE_x \tag{4.339}$$

We then have

$$\langle\alpha| U |\beta\rangle = \int_0^{2\pi} e^{ix} \, d\langle\alpha| E_x |\beta\rangle = \int_0^{2\pi} f(x) \, d\langle\alpha| E_x |\beta\rangle = \langle\alpha| f(B) |\beta\rangle \tag{4.340}$$

which implies that $U$, in this case, must be a particular function of $B$, namely
$U = e^{iB}$. In addition, any function of $U$ is clearly a function of $B$.


There exists another relation between Hermitian and unitary operators that is a
fundamental property in quantum mechanics. Let $H$ be a self-adjoint operator
with the spectral decomposition

$$H = \int_{-\infty}^{\infty} x \, dE_x \tag{4.341}$$
For every real number $t$ let

$$\langle\alpha| U_t |\beta\rangle = \int_{-\infty}^{\infty} e^{itx} \, d\langle\alpha| E_x |\beta\rangle \tag{4.342}$$

This then defines an operator $U_t = e^{itH}$ which is unitary since $(e^{itx})^* e^{itx} = 1$.
We also have $U_0 = I$.

Now since

$$e^{itx} e^{it'x} = e^{i(t+t')x} \tag{4.343}$$

we must have

$$U_t U_{t'} = U_{t+t'} \tag{4.344}$$

for all real numbers $t$ and $t'$.

The converse of this property is called Stone's theorem:

For each real number $t$ let $U_t$ be a unitary operator. Assume that $\langle\alpha| U_t |\beta\rangle$ is a
continuous function of $t$ for all vectors $|\alpha\rangle$ and $|\beta\rangle$. If $U_0 = I$ and $U_t U_{t'} = U_{t+t'}$
for all real numbers $t$ and $t'$, then there is a unique self-adjoint operator $H$ such
that $U_t = e^{itH}$ for all $t$. A vector $|\beta\rangle$ is in the domain of $H$ if and only if the
vectors

$$\frac{1}{it} (U_t - I) |\beta\rangle \tag{4.345}$$

converge to a limit as $t \to 0$. The limit vector is $H|\beta\rangle$. If a bounded operator
commutes with $U_t$, then it commutes with $H$.

This theorem leads to the following very important result.

If $U_t |\beta\rangle$ is in the domain of $H$, then

$$\frac{1}{i\Delta t} (U_{\Delta t} - I) U_t |\beta\rangle \to H U_t |\beta\rangle \tag{4.346}$$

or

$$\frac{1}{i\Delta t} (U_{\Delta t} U_t - U_t) |\beta\rangle = \frac{1}{i\Delta t} (U_{t+\Delta t} - U_t) |\beta\rangle \to -i \frac{d}{dt} U_t |\beta\rangle \tag{4.347}$$

as $\Delta t \to 0$. We can then write

$$-i \frac{d}{dt} U_t |\beta\rangle = H U_t |\beta\rangle \tag{4.348}$$

This equation will eventually tell us how the physical states (ket vectors) and
state operators that will represent physical systems evolve in time (it will lead
to the Schrödinger equation, which is the time evolution equation in one representation
of quantum mechanics).
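The group property (4.344) and the differential equation (4.348) can be illustrated numerically in finite dimensions. The sketch below is only an illustration under stated assumptions: it takes $H$ to be a 2×2 real symmetric matrix with eigenvalues 7 and 1 (an arbitrary choice), builds $U_t = \sum_k e^{i t a_k} P_k$ from its projectors, and estimates $-i\,dU_t/dt$ by a central difference. The helper names `U`, `dagger`, `max_diff`, and `generator_action` are hypothetical.

```python
import cmath

# Assumed example: H = [[4, 3], [3, 4]] with eigenvalues 7 and 1 and projectors P7, P1.
H = [[4.0, 3.0], [3.0, 4.0]]
EIGEN = [
    (7.0, [[0.5, 0.5], [0.5, 0.5]]),
    (1.0, [[0.5, -0.5], [-0.5, 0.5]]),
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def U(t):
    """U_t = sum_k exp(i t a_k) P_k, the finite-dimensional analog of (4.342)."""
    out = [[0j, 0j], [0j, 0j]]
    for a_k, P_k in EIGEN:
        phase = cmath.exp(1j * t * a_k)
        for i in range(2):
            for j in range(2):
                out[i][j] += phase * P_k[i][j]
    return out


def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def max_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def generator_action(t, dt=1e-5):
    """Central-difference estimate of -i dU_t/dt, to be compared with H U_t."""
    up, down = U(t + dt), U(t - dt)
    return [[-1j * (up[i][j] - down[i][j]) / (2 * dt) for j in range(2)]
            for i in range(2)]
```

For example, `max_diff(matmul(U(0.3), U(0.5)), U(0.8))` should be at the level of rounding error, and `generator_action(0.4)` should agree with `matmul(H, U(0.4))` up to the finite-difference error.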
Examples - Functions of Operators

Suppose that we have the eigenvector/eigenvalue equations for a self-adjoint
operator

$$A|k\rangle = a_k |k\rangle , \quad k = 1, 2, \ldots, N \tag{4.349}$$

We then assume that

$$f(A)|k\rangle = f(a_k) |k\rangle , \quad k = 1, 2, \ldots, N \tag{4.350}$$

for the eigenvectors.

We can show that this works for polynomials and power series as follows:

$$|\psi\rangle = \sum_{k=1}^{N} |k\rangle\langle k|\psi\rangle \tag{4.351}$$

$$A|\psi\rangle = A \sum_{k=1}^{N} |k\rangle\langle k|\psi\rangle = \sum_{k=1}^{N} A|k\rangle\langle k|\psi\rangle = \sum_{k=1}^{N} a_k |k\rangle\langle k|\psi\rangle = \Big( \sum_{k=1}^{N} a_k |k\rangle\langle k| \Big) |\psi\rangle \tag{4.352}$$

$$\to A = \sum_{k=1}^{N} a_k |k\rangle\langle k| \to \text{spectral resolution of the operator} \tag{4.353}$$

Now define the projection operator

$$P_k = |k\rangle\langle k| \to P_k P_j = P_k \delta_{kj} \tag{4.354}$$

We then have

$$A = \sum_{k=1}^{N} a_k |k\rangle\langle k| = \sum_{k=1}^{N} a_k P_k \tag{4.355}$$

that is, the operator is represented by a sum over its eigenvalues and corresponding
projection operators.

We then have

$$A^2 = \Big( \sum_{k=1}^{N} a_k P_k \Big) \Big( \sum_{j=1}^{N} a_j P_j \Big) = \sum_{k,j=1}^{N} a_k a_j P_k P_j = \sum_{k,j=1}^{N} a_k a_j P_k \delta_{kj} = \sum_{k=1}^{N} a_k^2 P_k \to A^n = \sum_{k=1}^{N} a_k^n P_k \tag{4.356}$$

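Therefore, for

$$f(x) = \sum_{n=1}^{N} q_n x^n \tag{4.357}$$

we have

$$f(A) = \sum_{n=1}^{N} q_n A^n = \sum_{n=1}^{N} q_n \sum_{k=1}^{N} a_k^n P_k = \sum_{k=1}^{N} \Big( \sum_{n=1}^{N} q_n a_k^n \Big) P_k = \sum_{k=1}^{N} f(a_k) P_k \tag{4.358}$$

This says that, in general, we have

$$f(A)|\psi\rangle = f(A) \sum_{k=1}^{N} |k\rangle\langle k|\psi\rangle = \sum_{k=1}^{N} f(A)|k\rangle\langle k|\psi\rangle = \sum_{k=1}^{N} f(a_k) |k\rangle\langle k|\psi\rangle = \Big( \sum_{k=1}^{N} f(a_k) |k\rangle\langle k| \Big) |\psi\rangle \tag{4.359}$$

$$\to f(A) = \sum_{k=1}^{N} f(a_k) |k\rangle\langle k| \to \text{spectral resolution of the operator} \tag{4.360}$$

Numerical example: consider the operator

$$A = \begin{pmatrix} 4 & 3 \\ 3 & 4 \end{pmatrix}$$

which has eigenvalues 7 and 1 with eigenvectors

$$|7\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} , \qquad |1\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

This gives

$$P_7 = |7\rangle\langle 7| = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} , \qquad P_1 = |1\rangle\langle 1| = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$$

and therefore

$$A = 7 P_7 + P_1 = \frac{7}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 3 \\ 3 & 4 \end{pmatrix}$$

$$A^2 = 7^2 P_7 + P_1 = \frac{49}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 25 & 24 \\ 24 & 25 \end{pmatrix} = \begin{pmatrix} 4 & 3 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 4 & 3 \\ 3 & 4 \end{pmatrix}$$

$$\log(A) = \log(7) P_7 + \log(1) P_1 = \frac{\log(7)}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$

$$\sqrt{A} = \sqrt{7} P_7 + P_1 = \frac{\sqrt{7}}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 + \sqrt{7} & -1 + \sqrt{7} \\ -1 + \sqrt{7} & 1 + \sqrt{7} \end{pmatrix}$$

Clearly we then have

$$\log(A)|7\rangle = \frac{\log(7)}{2\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{\log(7)}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \log(7) |7\rangle$$

$$\log(A)|1\rangle = \frac{\log(7)}{2\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{\log(7)}{2\sqrt{2}} \begin{pmatrix} 0 \\ 0 \end{pmatrix} = 0 = \log(1) |1\rangle$$

as expected.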
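The numerical example can be reproduced in a few lines. This is a minimal sketch of the rule $f(A) = \sum_k f(a_k) P_k$ for the 2×2 matrix above; the helper names `f_of_A`, `matmul`, `scale`, `add`, and `close` are illustrative choices, not notation from the text.

```python
import math

# Projectors of A = [[4, 3], [3, 4]] onto the eigenvectors with eigenvalues 7 and 1.
P7 = [[0.5, 0.5], [0.5, 0.5]]
P1 = [[0.5, -0.5], [-0.5, 0.5]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def close(a, b, tol=1e-12):
    """True when two 2x2 matrices agree entrywise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def f_of_A(f):
    """Spectral rule f(A) = f(7) P7 + f(1) P1."""
    return add(scale(f(7.0), P7), scale(f(1.0), P1))


A = f_of_A(lambda x: x)
A_squared = f_of_A(lambda x: x ** 2)
log_A = f_of_A(math.log)
sqrt_A = f_of_A(math.sqrt)
```

Here `A_squared` can be compared with the ordinary matrix product `matmul(A, A)`, and `matmul(sqrt_A, sqrt_A)` with `A`, which checks that the spectral definition agrees with the polynomial one in this case.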

The big question that remains is then

"Is $\hat{f}(\hat{A}) = f(\hat{A})$?" (4.361)

no proof exists!


4.20. Commuting Operators 

As we stated earlier, the commutator of two operators is given by 

[A,B]=AB-BA (4.362) 

If A and B are self-adjoint operators, each possessing a complete set of eigen¬ 
vectors, and if they commute, then there exists a complete set of vectors which 
are eigenvectors of both A and B, that is, they possess a common set of eigen¬ 
vectors. 

This theorem extends to any number of commuting operators. If we have a 
set of N mutually commuting operators, then they all have a common set of 
eigenvectors. 

The reverse is also true. If two operators possess a common set of eigenvectors, 
then they commute. 

Let $\{A, B, C, \ldots\}$ be a set of mutually commuting operators that possess a complete
set of common eigenvectors. Corresponding to a particular eigenvalue for
each operator, there may be more than one eigenvector. If, however, there is no
more than one eigenvector for each set of eigenvalues $(a_k, b_k, c_k, \ldots)$, then the
operators $\{A, B, C, \ldots\}$ are said to be a complete commuting set of operators.

Any operator that commutes with all members of a complete commuting set 
must be a function of the operators in that set. 

Let us now think about these ideas in terms of projection operators. 

Let $Q$ be a Hermitian operator with a pure point spectrum so that we can write

$$Q = \sum_k q_k P_k \tag{4.363}$$

where each $q_k$ is a different eigenvalue of $Q$ and $P_k$ is the projection operator
onto the subspace corresponding to eigenvalue $q_k$.
Let $R$ be a bounded Hermitian operator that commutes with $Q$. For each $k$ and
for any arbitrary vector $|\eta\rangle$ we then have

$$Q R P_k |\eta\rangle = R Q P_k |\eta\rangle = R \Big( \sum_j q_j P_j \Big) P_k |\eta\rangle = R \sum_j q_j \delta_{jk} P_k |\eta\rangle = q_k R P_k |\eta\rangle \tag{4.364}$$

where we have used the relation

$$P_j P_k = \delta_{jk} P_k \tag{4.365}$$

for any vector $|\eta\rangle$. Thus, $R P_k |\eta\rangle$ is an eigenvector of $Q$ with eigenvalue $q_k$.
Therefore, $P_k R P_k |\eta\rangle = R P_k |\eta\rangle$ for all $|\eta\rangle$ and we have

$$R P_k = P_k R P_k \tag{4.366}$$

Taking the adjoint of both sides we get

$$(R P_k)^\dagger = (P_k R P_k)^\dagger \tag{4.367}$$

$$P_k R = P_k R P_k \tag{4.368}$$

Thus,

$$R P_k = P_k R \tag{4.369}$$

or each $P_k$ commutes with every bounded Hermitian operator which commutes
with $Q$.
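The result (4.369) can be checked on a small example. The sketch below assumes $Q$ is the 2×2 matrix with eigenvalues 7 and 1 and projectors `P7`, `P1`; the matrices `R` (which commutes with `Q`) and `S` (which does not) are arbitrary illustrative choices, and `commutes` is a hypothetical helper.

```python
# Assumed example operators, all real symmetric 2x2 matrices.
Q = [[4.0, 3.0], [3.0, 4.0]]
P7 = [[0.5, 0.5], [0.5, 0.5]]    # projector onto the eigenvalue-7 subspace of Q
P1 = [[0.5, -0.5], [-0.5, 0.5]]  # projector onto the eigenvalue-1 subspace of Q
R = [[2.0, 1.0], [1.0, 2.0]]     # commutes with Q
S = [[1.0, 0.0], [0.0, -1.0]]    # does not commute with Q


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutes(a, b, tol=1e-12):
    """True when the commutator ab - ba vanishes entrywise within tol."""
    ab, ba = matmul(a, b), matmul(b, a)
    return all(abs(ab[i][j] - ba[i][j]) < tol for i in range(2) for j in range(2))


# R commutes with Q, so it should commute with each spectral projector of Q.
r_commutes_with_projectors = commutes(R, P7) and commutes(R, P1)

# S fails to commute with Q, and here it also fails to commute with P7.
s_commutes_with_p7 = commutes(S, P7)
```

In this example `r_commutes_with_projectors` evaluates to `True` while `s_commutes_with_p7` evaluates to `False`, consistent with the statement that the $P_k$ commute with every bounded Hermitian operator that commutes with $Q$.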


We can extend this result to operators possessing both a point and continuous
spectrum. If $Q$ is a self-adjoint operator with the spectral decomposition

$$Q = \int_{-\infty}^{\infty} x \, dE_x \tag{4.370}$$

and if $R$ is a bounded self-adjoint operator that commutes with $Q$, then

$$E_x R = R E_x \tag{4.371}$$

for every $x$.

Let us now express the ideas of complete commuting sets in terms of projection 
operators. 

Let $\{B_1, B_2, \ldots, B_N\}$ be a set of mutually commuting Hermitian operators with
pure point spectra. For each we then have

$$B_r = \sum_k b_k^{(r)} P_k^{(r)} , \quad r = 1, 2, \ldots, N \tag{4.372}$$

where each $b_k^{(r)}$ is a different eigenvalue of $B_r$ and $P_k^{(r)}$ is the projection operator
onto the subspace spanned by the eigenvectors of $B_r$ corresponding to $b_k^{(r)}$.

By definition then, the projection operators $P_k^{(r)}$ commute with each other for
all different $r$ and $k$

$$P_j^{(r)} P_k^{(s)} = P_k^{(s)} P_j^{(r)} \tag{4.373}$$

This implies that

$$P_i^{(1)} P_j^{(2)} \cdots P_l^{(N)} \tag{4.374}$$

is the projection operator for any $i, j, \ldots, l$, that is, it projects onto the subspace
of all vectors $|\alpha\rangle$ such that

$$B_1 |\alpha\rangle = b_i^{(1)} |\alpha\rangle , \quad B_2 |\alpha\rangle = b_j^{(2)} |\alpha\rangle , \quad \ldots , \quad B_N |\alpha\rangle = b_l^{(N)} |\alpha\rangle \tag{4.375}$$

These projection operators are mutually orthogonal

$$P_i^{(1)} P_j^{(2)} \cdots P_l^{(N)} P_{i'}^{(1)} P_{j'}^{(2)} \cdots P_{l'}^{(N)} = \delta_{ii'} \delta_{jj'} \cdots \delta_{ll'} P_i^{(1)} P_j^{(2)} \cdots P_l^{(N)} \tag{4.376}$$

and they have the completeness property that

$$\sum_i \sum_j \cdots \sum_l P_i^{(1)} P_j^{(2)} \cdots P_l^{(N)} = I \tag{4.377}$$

Note that some of these projection operators might be zero.

Suppose that none of them projects onto a subspace of dimension larger than
one. In this case, we say that the set of operators $\{B_1, B_2, \ldots, B_N\}$ is a complete
set of commuting operators.

Now let us return to the study of a continuous spectrum. 

First, we repeat some earlier material to set the stage. 


Let us start with a very simple example so we can figure out how to proceed and
then generalize to a more complicated case. We consider the space $L^2(-\infty, \infty)$
and a single Hermitian operator $Q$ defined by

$$(Qg)(x) = x g(x) \tag{4.378}$$

It turns out that every bounded operator which commutes with $Q$ is a function
of $Q$.

Now, there exists a theorem: Suppose we have a set of mutually commuting operators
$\{A_i\}$. This is a complete set of commuting operators if and only if every
bounded operator which commutes with all $\{A_i\}$ is a function of the $\{A_i\}$.

In the previous discussion, we had the case of a complete commuting set con¬ 
sisting of a single operator. 
The spectrum of $Q$ is purely continuous and consists of all real numbers $x$. Each
vector $g$ is a function $g(x)$ on the spectrum of $Q$.

We connect this to the case of a complete set of commuting operators with
pure point spectra as follows.

We define two abstract vectors $|x\rangle$ and $|g\rangle$ such that

$$\langle x|g\rangle = g(x) \tag{4.379}$$

We then have

$$(Qg)(x) = \langle x|Qg\rangle = x \langle x|g\rangle = x g(x) \tag{4.380}$$

which is the spectral representation of $Q$. We can generalize to a function of $Q$
with

$$\langle x|f(Q)g\rangle = f(x) \langle x|g\rangle = f(x) g(x) \tag{4.381}$$

In Dirac language, we write an abstract equation like

$$Q|x\rangle = x|x\rangle \tag{4.382}$$

We then have

$$x^* \langle x| = x \langle x| = \langle x| Q^\dagger = \langle x| Q \tag{4.383}$$

and

$$\langle x| Q |g\rangle = x \langle x|g\rangle = x g(x) \tag{4.384}$$

which again gives the spectral representation of $Q$.

Finally we have

$$\langle x| f(Q) |g\rangle = f(x) \langle x|g\rangle = f(x) g(x) \tag{4.385}$$

The problem with defining an abstract Hermitian operator $Q$ by

$$Q|x\rangle = x|x\rangle \tag{4.386}$$

is that $Q$ has no eigenvectors in the Hilbert space $L^2$ of square-integrable functions.
In order for there to be eigenvectors we must have

$$(Qg)(x) = x g(x) = a g(x) \tag{4.387}$$

for a real number $a$. This implies that $g(x)$ is zero for all points $x \ne a$ and

$$\|g\|^2 = \int_{-\infty}^{\infty} |g(x)|^2 \, dx = 0 \tag{4.388}$$

because the standard integral is not changed by the value of the integrand at a
single point.

Now we have

$$\langle x| Q |a\rangle = x \langle x|a\rangle = a \langle x|a\rangle \tag{4.389}$$
If we replace the inner product by

$$\langle x|a\rangle = \langle a|x\rangle = \delta(x - a) \tag{4.390}$$

or, in general,

$$\langle x|x'\rangle = \langle x'|x\rangle = \delta(x - x') \tag{4.391}$$

We then have for each real number $a$

$$x \, \delta(x - a) = a \, \delta(x - a) \tag{4.392}$$

which is a valid mathematical relation for delta functions.

Thus, we can formally use Dirac delta functions for eigenfunctions of $Q$ as
follows: for each real number $a$ we have

$$x \, \delta(x - a) = a \, \delta(x - a) \tag{4.393}$$

If we write $|a\rangle$ for $\delta(x - a)$, we then have

$$Q|a\rangle = a|a\rangle \tag{4.394}$$

We must not consider $\delta(x - a)$ as a standard integrable function and we cannot
think of $|a\rangle$ as a vector in the Hilbert space $L^2$. We must do all mathematics
using the standard delta function rules.
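One way to build intuition for these rules is to replace $\delta(x - a)$ by a narrow normalized Gaussian and integrate numerically. The following is a rough sketch under that assumption; the width `eps`, the test function `g`, and the midpoint-rule helper `integrate` are illustrative choices, and the Gaussian is only an approximation to the delta function.

```python
import math


def delta_eps(x, eps):
    """Normalized Gaussian of width eps, standing in for delta(x)."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))


def integrate(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def g(x):
    """An arbitrary smooth, square-integrable test function."""
    return math.cos(x) * math.exp(-x * x / 10)


a = 0.7
eps = 1e-2

# Sifting property: integral of delta(x - a) g(x) dx is approximately g(a).
sifted = integrate(lambda x: delta_eps(x - a, eps) * g(x), a - 1.0, a + 1.0)

# Formal eigenvalue relation x delta(x - a) = a delta(x - a), tested under the integral.
x_weighted = integrate(lambda x: x * delta_eps(x - a, eps) * g(x), a - 1.0, a + 1.0)
```

As `eps` shrinks, `sifted` approaches `g(a)` and `x_weighted` approaches `a * g(a)`, which is how the relation (4.393) is meant to be read: as a statement valid under an integral sign.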


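In this way we have

$$\langle a|g\rangle = \int_{-\infty}^{\infty} \delta(x - a) g(x) \, dx = g(a) \tag{4.395}$$

as the components of a vector $|g\rangle$ in the spectral representation of $Q$. Note the
shift from a function $g(x)$ to the ket vector $|g\rangle$ and the relationship between the
two mathematical objects. In fact, we can write

$$g(x) = \int_{-\infty}^{\infty} \delta(x - a) g(a) \, da = \langle x|g\rangle = \langle x| I |g\rangle = \langle x| \Big( \int |a\rangle\langle a| \, da \Big) |g\rangle = \langle x| \int \langle a|g\rangle |a\rangle \, da \tag{4.396}$$

or

$$|g\rangle = \int_{-\infty}^{\infty} \langle a|g\rangle |a\rangle \, da \tag{4.397}$$

In addition, we have (using the properties of projection operators derived earlier)

$$\langle a|g\rangle = \langle a| I |g\rangle = \langle a| \int |x\rangle\langle x| \, dx \, |g\rangle = \int \langle a|x\rangle \langle x|g\rangle \, dx = \int \delta(x - a) g(x) \, dx = g(a) \tag{4.398}$$

as expected.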

Thus, formally, we can think of any function $g(x)$ as a linear combination
of delta functions, where the delta functions are analogous to an orthonormal
basis of eigenvectors with the symbol $|x\rangle$. We then have

$$|g\rangle = \int \langle x|g\rangle |x\rangle \, dx , \qquad I = \int |x\rangle\langle x| \, dx \tag{4.399}$$

Thus, for each real number $a$, we define the operator $|a\rangle\langle a|$ by

$$(|a\rangle\langle a| g)(x) = \langle x|a\rangle \langle a|g\rangle = g(a) \delta(x - a) \tag{4.400}$$

$$|a\rangle\langle a|g\rangle = \langle a|g\rangle |a\rangle = g(a) |a\rangle \tag{4.401}$$

In a similar manner, the projection operator $E_x$ in the spectral decomposition
of $Q$ is given by

$$(E_x g)(y) = \int_{-\infty}^{x} g(a) \delta(y - a) \, da \tag{4.402}$$

which we can write as

$$(E_x g)(y) = \int_{-\infty}^{x} (|a\rangle\langle a| g)(y) \, da \tag{4.403}$$

This says that

$$E_x = \int_{-\infty}^{x} |a\rangle\langle a| \, da \tag{4.404}$$

Finally, we have

$$Q = \int_{-\infty}^{\infty} x \, dE_x \tag{4.405}$$

which we write as

$$Q = \int_{-\infty}^{\infty} x \, |x\rangle\langle x| \, dx \tag{4.406}$$

This is analogous to a sum of eigenvalues multiplying projection operators onto
eigenvector subspaces. We will return to this discussion when we introduce the
position operator.


4.21. Another Continuous Spectrum Operator 

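We already have the relation

$$\langle x|x'\rangle = \delta(x - x') \tag{4.407}$$

Let us introduce a new operator $\hat{p}$ such that

$$\hat{p}|p\rangle = p|p\rangle , \quad -\infty < p < \infty , \quad p \text{ real} \tag{4.408}$$

Thus $\hat{p}$ is an operator with a continuous spectrum just like $Q$. As we found to
be true for $Q$, we can now also write

$$\hat{p} = \frac{1}{2\pi\hbar} \int p \, |p\rangle\langle p| \, dp , \qquad I = \frac{1}{2\pi\hbar} \int |p\rangle\langle p| \, dp \tag{4.409}$$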
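This allows us to write

$$\langle x'|x\rangle = \delta(x' - x) = \langle x'| I |x\rangle = \frac{1}{2\pi\hbar} \int \langle x'|p\rangle \langle p|x\rangle \, dp = \frac{1}{2\pi\hbar} \int \langle x'|p\rangle \langle x|p\rangle^* \, dp \tag{4.410}$$

Now, one of the standard representations of the delta function is

$$\delta(x - x') = \frac{1}{2\pi\hbar} \int e^{-ip(x - x')/\hbar} \, dp \tag{4.411}$$

Thus we can write

$$\int \langle x'|p\rangle \langle x|p\rangle^* \, dp = \int e^{-ip(x - x')/\hbar} \, dp \tag{4.412}$$

One of the most important solutions to this equation is

$$\langle x|p\rangle = e^{ipx/\hbar} \to \langle x|p\rangle^* = e^{-ipx/\hbar} , \quad \langle x'|p\rangle = e^{ipx'/\hbar} \tag{4.413}$$

As we shall see later, this choice will correspond to the new operator $\hat{p}$ representing
the standard linear momentum.

We can then write

$$\langle x| \hat{p} |p\rangle = p \langle x|p\rangle = p \, e^{ipx/\hbar} = -i\hbar \frac{\partial}{\partial x} e^{ipx/\hbar} = -i\hbar \frac{\partial}{\partial x} \langle x|p\rangle \tag{4.414}$$

which, in our earlier notation, says

$$(\hat{p} g)(x) = -i\hbar \frac{\partial}{\partial x} g(x) \tag{4.415}$$

In addition we can write

$$\begin{aligned}
\langle p| \hat{x} |\psi\rangle &= [\langle\psi| \hat{x} |p\rangle]^* = \Big[ \langle\psi| \hat{x} \Big( \int |x'\rangle\langle x'| \, dx' \Big) |p\rangle \Big]^* \\
&= \Big[ \int \langle\psi| \hat{x} |x'\rangle \langle x'|p\rangle \, dx' \Big]^* = \Big[ \int x' \langle\psi|x'\rangle \langle x'|p\rangle \, dx' \Big]^* \\
&= \Big[ \int \langle\psi|x'\rangle \, x' \langle x'|p\rangle \, dx' \Big]^* = \Big[ \int \langle\psi|x'\rangle \Big( -i\hbar \frac{\partial}{\partial p} \Big) \langle x'|p\rangle \, dx' \Big]^* \\
&= \Big[ -i\hbar \frac{\partial}{\partial p} \int \langle\psi|x'\rangle \langle x'|p\rangle \, dx' \Big]^* = \Big[ -i\hbar \frac{\partial}{\partial p} \langle\psi| \Big( \int |x'\rangle\langle x'| \, dx' \Big) |p\rangle \Big]^* \\
&= i\hbar \frac{\partial}{\partial p} [\langle\psi|p\rangle]^* = i\hbar \frac{\partial}{\partial p} \langle p|\psi\rangle \\
&\Rightarrow \langle p| \hat{x} = i\hbar \frac{\partial}{\partial p} \langle p|
\end{aligned} \tag{4.416}$$

since $|\psi\rangle$ is arbitrary.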
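The only non-obvious step in the chain leading to (4.416) is the replacement of the factor $x'$ by a derivative with respect to $p$. As a short worked check, assuming the choice $\langle x'|p\rangle = e^{ipx'/\hbar}$ made in (4.413):

```latex
% Differentiating the plane wave with respect to p pulls down a factor of x':
\frac{\partial}{\partial p} e^{ipx'/\hbar} = \frac{i x'}{\hbar} \, e^{ipx'/\hbar}
\quad\Longrightarrow\quad
x' \, e^{ipx'/\hbar} = -i\hbar \frac{\partial}{\partial p} e^{ipx'/\hbar}

% The same argument with x and p interchanged reproduces (4.414):
p \, e^{ipx/\hbar} = -i\hbar \frac{\partial}{\partial x} e^{ipx/\hbar}
```

The sign change from $-i\hbar$ to $+i\hbar$ in the final line of (4.416) comes entirely from the complex conjugation applied to the bracketed expression.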
Now, since the eigenvectors of $\hat{p}$ form a basis, we can write any arbitrary vector
as

$$|g\rangle = \int \langle p|g\rangle |p\rangle \, dp \tag{4.417}$$

which implies

$$\langle x|g\rangle = \int \langle p|g\rangle \langle x|p\rangle \, dp = \int \langle p|g\rangle \, e^{ipx/\hbar} \, dp \tag{4.418}$$

Now the theory of Fourier transforms says that

$$g(x) = \int G(p) \, e^{ipx/\hbar} \, dp \tag{4.419}$$

where $G(p)$ is the Fourier transform of $g(x)$. Thus, we find that $G(p) = \langle p|g\rangle$
is the Fourier transform of $g(x)$.


More about Fourier Transforms (in general)

In the space $L^2$ of square-integrable functions, let us consider a self-adjoint
operator defined by the relation we found earlier for the $p$ operator

$$(pg)(x) = -i\hbar \frac{\partial}{\partial x} g(x) \tag{4.420}$$

As we have already seen, there is a direct connection here to Fourier transforms.
Let us review some of the mathematical concepts connected with the Fourier
transform.

If $g$ is a function (vector) in $L^2$, then

$$\psi_n(k) = \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{-ikx} g(x) \, dx \tag{4.421}$$

defines a sequence of functions (vectors) $\psi_n$ in $L^2$ which converges as $n \to \infty$ to
a limit function (vector) $Gg$ such that $\|Gg\|^2 = \|g\|^2$ and

$$\psi(k) = (Gg)(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} g(x) \, dx \tag{4.422}$$

We also have

$$g_n(x) = \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{ikx} (Gg)(k) \, dk \tag{4.423}$$

which defines a sequence of functions (vectors) that converges to $g$ as $n \to \infty$
where

$$g(x) = \frac{1}{\sqrt{2\pi}} \int (Gg)(k) \, e^{ikx} \, dk \tag{4.424}$$

Now, this gives

$$(pg)(x) = -i\hbar \frac{\partial}{\partial x} g(x) = \frac{1}{\sqrt{2\pi}} \int (Gg)(k) \, \hbar k \, e^{ikx} \, dk \tag{4.425}$$
It is clear from this expression that a vector $g$ is in the domain of $p$ if and only
if the quantity $k(Gg)(k)$ is square-integrable. We then have

$$\begin{aligned}
(Gpg)(k) &= \frac{1}{\sqrt{2\pi}} \int (pg)(x) \, e^{-ikx} \, dx = \frac{1}{\sqrt{2\pi}} \int \Big( \frac{1}{\sqrt{2\pi}} \int (Gg)(k') \, \hbar k' \, e^{ik'x} \, dk' \Big) e^{-ikx} \, dx \\
&= \int (Gg)(k') \, \hbar k' \, dk' \, \frac{1}{2\pi} \int e^{i(k' - k)x} \, dx \\
&= \int (Gg)(k') \, \hbar k' \, \delta(k' - k) \, dk' = \hbar k (Gg)(k)
\end{aligned} \tag{4.426}$$

We call $Gg$ the Fourier transform of $g$. $G$ is a unitary operator on $L^2$. Its inverse
is given by

$$(G^{-1}h)(x) = \frac{1}{\sqrt{2\pi}} \int h(k) \, e^{ikx} \, dk \tag{4.427}$$

which implies

$$(G^{-1}h)(x) = (Gh)(-x) \tag{4.428}$$

for every $h$. Since $G$ is unitary, it preserves inner products as well as lengths of
vectors so we have

$$\int (Gh)(k)^* (Gg)(k) \, dk = \int h^*(x) g(x) \, dx \tag{4.429}$$

for all vectors $h$ and $g$.
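The norm-preservation property and the action of $p$ in the transformed representation can be checked numerically for a simple test function. The sketch below makes several assumptions that are not in the text: it sets $\hbar = 1$, uses the Gaussian $g(x) = e^{-x^2/2}$ (whose transform with the $1/\sqrt{2\pi}$ convention is $e^{-k^2/2}$), truncates the integrals to finite intervals, and uses a plain midpoint rule. The names `integrate`, `fourier`, `g`, and `pg` are illustrative.

```python
import cmath
import math


def integrate(func, lo, hi, n):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def fourier(func, k, half_width=10.0, n=2000):
    """(G func)(k) = (1/sqrt(2 pi)) * integral of exp(-i k x) func(x) dx (truncated)."""
    total = integrate(lambda x: cmath.exp(-1j * k * x) * func(x),
                      -half_width, half_width, n)
    return total / math.sqrt(2 * math.pi)


def g(x):
    """Gaussian test function."""
    return math.exp(-x * x / 2)


def pg(x):
    """(p g)(x) = -i dg/dx with hbar = 1, evaluated analytically for the Gaussian."""
    return 1j * x * math.exp(-x * x / 2)


# Unitarity: the squared norms of g and Gg should agree.
norm_g_sq = integrate(lambda x: abs(g(x)) ** 2, -10.0, 10.0, 2000)
norm_Gg_sq = integrate(lambda k: abs(fourier(g, k)) ** 2, -8.0, 8.0, 200)

# Multiplication rule: (G p g)(k) should equal k (G g)(k) when hbar = 1.
k0 = 1.3
lhs = fourier(pg, k0)
rhs = k0 * fourier(g, k0)
```

In this sketch `norm_g_sq` and `norm_Gg_sq` should both be close to $\sqrt{\pi}$, and `lhs` should agree with `rhs`, illustrating (4.426) with $\hbar$ set to 1.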


In terms of the operator $Q$ defined by

$$(Qg)(x) = x g(x) \tag{4.430}$$

it can be shown that

$$Gp = QG \tag{4.431}$$

or

$$p = G^{-1} Q G \tag{4.432}$$

From the spectral decomposition

$$Q = \int_{-\infty}^{\infty} y \, dE_y \tag{4.433}$$

we can then obtain the spectral decomposition of $p$. Since $G$ is unitary and the
set of operators $E_y$ is a spectral family of projection operators, the set of
operators $G^{-1} E_y G$ is also a spectral family of projection operators.

Since $G^{-1} = G^\dagger$, we have

$$(h, pg) = (h, G^\dagger Q G g) = (Gh, Q G g) = \int_{-\infty}^{\infty} y \, d(Gh, E_y G g) = \int_{-\infty}^{\infty} y \, d(h, G^{-1} E_y G g) \tag{4.434}$$
(4.434) 
for any vector $h$ and any vector $g$ in the domain of $p$. Thus the spectral decomposition
of $p$ is

$$p = \int_{-\infty}^{\infty} y \, d(G^{-1} E_y G) \tag{4.435}$$

Now recall that $E_y$ is the projection operator onto the subspace of all vectors
$g$ such that $g(x) = 0$ for $x > y$. Therefore, $G^{-1} E_y G$ is the projection operator
onto the subspace of all vectors $g$ such that $(Gg)(k) = 0$ for $k > y$.

This means that $p$ has the same spectrum as $Q$, namely, a purely continuous
spectrum consisting of all real numbers as we already assumed at the beginning
of our discussion. These results generalize to functions of the operators.


We have been thinking of the Fourier transform as an operator $G$ which takes
a vector $g$ to a different vector $Gg$. We may also think of $g(x)$ and $(Gg)(k)$
as two different ways of representing the same vector $g$ as a function. We can
write

$$\langle k|g\rangle = (Gg)(k) \tag{4.436}$$

provided we are careful not to confuse this with

$$\langle x|g\rangle = g(x) \tag{4.437}$$

We think of $\langle k|g\rangle$ as a function on the spectrum of $p$. We then have

$$\langle k| p |g\rangle = k \langle k|g\rangle \tag{4.438}$$

which is the spectral representation of $p$.

For a function $f$ of $p$ we have

$$\langle k| f(p) |g\rangle = f(k) \langle k|g\rangle \tag{4.439}$$

The operator $p$ has no eigenvectors (as was true earlier for $Q$). It does, however,
have eigenfunctions which we can use as analogs of eigenvectors as we did earlier
for $Q$.

If we write $|k\rangle$ for

$$\frac{1}{\sqrt{2\pi}} e^{ikx} \tag{4.440}$$

we have

$$p|k\rangle = \hbar k |k\rangle \tag{4.441}$$

as we assumed at the beginning, since

$$-i\hbar \frac{\partial}{\partial x} \Big( \frac{1}{\sqrt{2\pi}} e^{ikx} \Big) = \hbar k \Big( \frac{1}{\sqrt{2\pi}} e^{ikx} \Big) \tag{4.442}$$
For the components of a vector $g$ in the spectral representation of the operator
$p$ we have

$$\langle k|g\rangle = (Gg)(k) = \frac{1}{\sqrt{2\pi}} \int g(x) \, e^{-ikx} \, dx \tag{4.443}$$

We can think of these as the inner products of $g$ with the eigenfunctions $e^{ikx}/\sqrt{2\pi}$.
We have also

$$g(x) = \frac{1}{\sqrt{2\pi}} \int (Gg)(k) \, e^{ikx} \, dk \tag{4.444}$$

which we can write as

$$|g\rangle = \int \langle k|g\rangle |k\rangle \, dk \tag{4.445}$$

so we can think of any vector $g$ as a linear combination of the eigenfunctions.

Thus, the eigenfunctions $e^{ikx}/\sqrt{2\pi}$ are analogous to an orthonormal basis of
eigenvectors. They are not vectors in the Hilbert space $L^2$, however, because
they are not square-integrable.

We use them in the same way that we earlier used the delta functions for eigenfunctions
of the operator $Q$. In fact,

$$\frac{1}{\sqrt{2\pi}} e^{ikx} = \frac{1}{\sqrt{2\pi}} \int \delta(k' - k) \, e^{ik'x} \, dk' \tag{4.446}$$

is the inverse Fourier transform of the delta function.

Now let $|k\rangle\langle k|$ be defined by

$$(|k\rangle\langle k| g)(x) = (Gg)(k) \frac{1}{\sqrt{2\pi}} e^{ikx} \tag{4.447}$$

or

$$|k\rangle\langle k|g\rangle = \langle k|g\rangle |k\rangle \tag{4.448}$$

Then for the projection operators $G^{-1} E_y G$ in the spectral decomposition of $p$
we can write

$$(G^{-1} E_y G g)(x) = \int_{k<y} (Gg)(k) \frac{1}{\sqrt{2\pi}} e^{ikx} \, dk = \int_{k<y} (|k\rangle\langle k| g)(x) \, dk \tag{4.449}$$

or

$$G^{-1} E_y G = \int_{k<y} |k\rangle\langle k| \, dk \tag{4.450}$$

and for the spectral decomposition of the operator $p$ we get

$$p = \int \hbar k \, |k\rangle\langle k| \, dk \tag{4.451}$$

which is the same spectral decomposition in terms of eigenvalues and eigenvectors
that we saw earlier.
4.22. Problems


4.22.1. Simple Basis Vectors

Given two vectors

$$\vec{A} = 7\hat{e}_1 + 6\hat{e}_2 , \qquad \vec{B} = -2\hat{e}_1 + 16\hat{e}_2$$

written in the $\{\hat{e}_1, \hat{e}_2\}$ basis set and given another basis set

$$\hat{e}_q = \frac{1}{2}\hat{e}_1 + \frac{\sqrt{3}}{2}\hat{e}_2 , \qquad \hat{e}_p = -\frac{\sqrt{3}}{2}\hat{e}_1 + \frac{1}{2}\hat{e}_2$$

(a) Show that $\hat{e}_q$ and $\hat{e}_p$ are orthonormal.

(b) Determine the new components of $\vec{A}$, $\vec{B}$ in the $\{\hat{e}_q, \hat{e}_p\}$ basis set.

4.22.2. Eigenvalues and Eigenvectors 

Find the eigenvalues and normalized eigenvectors of the matrix

$$A = \begin{pmatrix} 1 & 2 & 4 \\ 2 & 3 & 0 \\ 5 & 0 & 3 \end{pmatrix}$$

Are the eigenvectors orthogonal? Comment on this.

4.22.3. Orthogonal Basis Vectors

Determine the eigenvalues and eigenstates of the following matrix

$$A = \begin{pmatrix} 2 & 2 & 0 \\ 1 & 2 & 1 \\ 1 & 2 & 1 \end{pmatrix}$$

Using Gram-Schmidt, construct an orthonormal basis set from the eigenvectors
of this operator.

4.22.4. Operator Matrix Representation 

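If the states $\{|1\rangle, |2\rangle, |3\rangle\}$ form an orthonormal basis and if the operator $G$ has
the properties

$$G|1\rangle = 2|1\rangle - 4|2\rangle + 7|3\rangle$$

$$G|2\rangle = -2|1\rangle + 3|3\rangle$$

$$G|3\rangle = 11|1\rangle + 2|2\rangle - 6|3\rangle$$

What is the matrix representation of $G$ in the $|1\rangle, |2\rangle, |3\rangle$ basis?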
4.22.5. Matrix Representation and Expectation Value

If the states $\{|1\rangle, |2\rangle, |3\rangle\}$ form an orthonormal basis and if the operator $K$ has
the properties

$$K|1\rangle = 2|1\rangle$$

$$K|2\rangle = 3|2\rangle$$

$$K|3\rangle = -6|3\rangle$$

(a) Write an expression for $K$ in terms of its eigenvalues and eigenvectors (projection
operators). Use this expression to derive the matrix representing
$K$ in the $|1\rangle, |2\rangle, |3\rangle$ basis.

(b) What is the expectation or average value of $K$, defined as $\langle\alpha| K |\alpha\rangle$, in the
state

$$|\alpha\rangle = \frac{1}{\sqrt{83}} (-3|1\rangle + 5|2\rangle + 7|3\rangle)$$


4.22.6. Projection Operator Representation 

Let the states $\{|1\rangle, |2\rangle, |3\rangle\}$ form an orthonormal basis. We consider the operator
given by $P_2 = |2\rangle\langle 2|$. What is the matrix representation of this operator? What
are its eigenvalues and eigenvectors? For the arbitrary state

$$|A\rangle = \frac{1}{\sqrt{83}} (-3|1\rangle + 5|2\rangle + 7|3\rangle)$$

What is the result of $P_2 |A\rangle$?


4.22.7. Operator Algebra 

An operator for a two-state system is given by 

$$P = a \, (|1\rangle\langle 1| - |2\rangle\langle 2| + |1\rangle\langle 2| + |2\rangle\langle 1|)$$

where a is a number. Find the eigenvalues and the corresponding eigenkets. 


4.22.8. Functions of Operators 

Suppose that we have some operator $Q$ such that $Q|q\rangle = q|q\rangle$, i.e., $|q\rangle$ is an
eigenvector of $Q$ with eigenvalue $q$. Show that $|q\rangle$ is also an eigenvector of the
operators $Q^2$, $Q^n$ and $e^Q$ and determine the corresponding eigenvalues.
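4.22.9. A Symmetric Matrix

Let $A$ be a $4 \times 4$ symmetric matrix. Assume that the eigenvalues are given by
0, 1, 2, and 3 with the corresponding normalized eigenvectors

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} , \quad \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix} , \quad \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} , \quad \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix}$$

Find the matrix $A$.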


4.22.10. Determinants and Traces 

Let A be an nxn matrix. Show that 

det(exp(A)) = exp(Tr(A)) 

4.22.11. Function of a Matrix 

Let 

A = ( 2 

Calculate exp(aA), a real. 

4.22.12. More Gram-Schmidt 

Let $A$ be the symmetric matrix

$$A = \begin{pmatrix} 5 & -2 & -4 \\ -2 & 2 & 2 \\ -4 & 2 & 5 \end{pmatrix}$$

Determine the eigenvalues and eigenvectors of A. Are the eigenvectors orthog¬ 
onal to each other? If not, find an orthogonal set using the Gram-Schmidt 
process. 



4.22.13. Infinite Dimensions 

Let $A$ be a square finite-dimensional matrix (real elements) such that $AA^T = I$.

(a) Show that $A^TA = I$.

(b) Does this result hold for infinite dimensional matrices? 
4.22.14. Spectral Decomposition

Find the eigenvalues and eigenvectors of the matrix

$$M = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$


Construct the corresponding projection operators, and verify that the matrix 
can be written in terms of its eigenvalues and eigenvectors. This is the spectral 
decomposition for this matrix. 


4.22.15. Measurement Results 

Given particles in state

$$|\alpha\rangle = \frac{1}{\sqrt{83}}\left(-3|1\rangle + 5|2\rangle + 7|3\rangle\right)$$

where $\{|1\rangle, |2\rangle, |3\rangle\}$ form an orthonormal basis, what are the possible
experimental results for a measurement of

$$Y = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -6 \end{pmatrix}$$

(written in this basis) and with what probabilities do they occur?


4.22.16. Expectation Values 

Let 


represent an observable, and 


$$|\psi\rangle = \begin{pmatrix} a \\ b \end{pmatrix}$$

be an arbitrary state vector (with $|a|^2 + |b|^2 = 1$). Calculate $\langle R^2\rangle$ in two ways:

(a) Evaluate $\langle R^2\rangle = \langle\psi|R^2|\psi\rangle$ directly.

(b) Find the eigenvalues ($r_1$ and $r_2$) and eigenvectors ($|r_1\rangle$ and $|r_2\rangle$) of $R^2$ or
$R$. Expand the state vector as a linear combination of the eigenvectors
and evaluate

$$\langle R^2\rangle = r_1^2|c_1|^2 + r_2^2|c_2|^2$$
4.22.18. The World of Hard/Soft Particles

Let us define a state using a hardness basis $\{|h\rangle, |s\rangle\}$, where

$$O_{hardness}|h\rangle = |h\rangle, \quad O_{hardness}|s\rangle = -|s\rangle$$

and the hardness operator $O_{hardness}$ is represented (in this basis) by

$$O_{hardness} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

Suppose that we are in the state

$$|A\rangle = \cos\theta\,|h\rangle + e^{i\varphi}\sin\theta\,|s\rangle$$

(a) Is this state normalized? Show your work. If not, normalize it.

(b) Find the state $|B\rangle$ that is orthogonal to $|A\rangle$. Make sure $|B\rangle$ is normalized.

(c) Express $|h\rangle$ and $|s\rangle$ in the $\{|A\rangle, |B\rangle\}$ basis.

(d) What are the possible outcomes of a hardness measurement on state $|A\rangle$
and with what probability will each occur?

(e) Express the hardness operator in the $\{|A\rangle, |B\rangle\}$ basis.

4.22.19. Things in Hilbert Space 

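For all parts of this problem, let $\mathcal{H}$ be a Hilbert space spanned by the basis kets
$\{|0\rangle, |1\rangle, |2\rangle, |3\rangle\}$, and let $a$ and $b$ be arbitrary complex constants.

(a) Which of the following are Hermitian operators on $\mathcal{H}$?

1. $|0\rangle\langle 1| + i|1\rangle\langle 0|$

2. $|0\rangle\langle 0| + |1\rangle\langle 1| + |2\rangle\langle 3| + |3\rangle\langle 2|$

3. $(a|0\rangle + |1\rangle)^\dagger(a|0\rangle + |1\rangle)$

4. $\left((a|0\rangle + b^*|1\rangle)^\dagger(b|0\rangle - a^*|1\rangle)\right)|2\rangle\langle 1| + |3\rangle\langle 3|$

5. $|0\rangle\langle 0| + i|1\rangle\langle 0| - i|0\rangle\langle 1| + |1\rangle\langle 1|$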

(b) Find the spectral decomposition of the following operator on $\mathcal{H}$:

$$K = |0\rangle\langle 0| + 2|1\rangle\langle 2| + 2|2\rangle\langle 1| - |3\rangle\langle 3|$$

(c) Let $|\psi\rangle$ be a normalized ket in $\mathcal{H}$, and let $I$ denote the identity operator
on $\mathcal{H}$. Is the operator

$$B = \frac{1}{\sqrt{2}}\left(I + |\psi\rangle\langle\psi|\right)$$

a projection operator?

(d) Find the spectral decomposition of the operator $B$ from part (c).

4.22.20. A 2-Dimensional Hilbert Space 

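Consider a 2-dimensional Hilbert space spanned by an orthonormal basis $\{|\uparrow\rangle, |\downarrow\rangle\}$.
This corresponds to spin up/down for spin = 1/2 as we will see later in Chapter
9. Let us define the operators

$$S_x = \frac{\hbar}{2}\left(|\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow|\right), \quad
S_y = \frac{\hbar}{2i}\left(|\uparrow\rangle\langle\downarrow| - |\downarrow\rangle\langle\uparrow|\right), \quad
S_z = \frac{\hbar}{2}\left(|\uparrow\rangle\langle\uparrow| - |\downarrow\rangle\langle\downarrow|\right)$$

(a) Show that each of these operators is Hermitian.

(b) Find the matrix representations of these operators in the $\{|\uparrow\rangle, |\downarrow\rangle\}$ basis.

(c) Show that $[S_x, S_y] = i\hbar S_z$, and cyclic permutations. Do this two ways:
using the Dirac notation definitions above and using the matrix representations
found in (b).

Now let

$$|\pm\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle \pm |\downarrow\rangle\right)$$

(d) Show that these vectors form a new orthonormal basis.

(e) Find the matrix representations of these operators in the $\{|+\rangle, |-\rangle\}$ basis.

(f) The matrices found in (b) and (e) are related through a similarity
transformation given by a unitary matrix, $U$, such that

$$S_x^{(\uparrow\downarrow)} = U^\dagger S_x^{(\pm)} U, \quad S_y^{(\uparrow\downarrow)} = U^\dagger S_y^{(\pm)} U, \quad S_z^{(\uparrow\downarrow)} = U^\dagger S_z^{(\pm)} U$$

where the superscript denotes the basis in which the operator is
represented. Find $U$ and show that it is unitary.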
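Now let

$$S_\pm = \frac{1}{\hbar}\left(S_x \pm iS_y\right)$$

(g) Express $S_\pm$ as outer products in the $\{|\uparrow\rangle, |\downarrow\rangle\}$ basis and show that $S_+^\dagger = S_-$.

(h) Show that

$$S_+|\downarrow\rangle = |\uparrow\rangle, \quad S_-|\uparrow\rangle = |\downarrow\rangle, \quad S_-|\downarrow\rangle = 0, \quad S_+|\uparrow\rangle = 0$$

and find

$$\langle\uparrow|S_+, \quad \langle\downarrow|S_+, \quad \langle\uparrow|S_-, \quad \langle\downarrow|S_-$$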

4.22.21. Find the Eigenvalues 

The three matrices $M_x$, $M_y$, $M_z$, each with 256 rows and columns, obey the
commutation rules

$$[M_i, M_j] = i\hbar\epsilon_{ijk}M_k$$

The eigenvalues of $M_z$ are $\pm 2\hbar$ (each once), $\pm 3\hbar/2$ (each 8
times), $\pm\hbar$ (each 28 times), $\pm\hbar/2$ (each 56 times), and 0 (70 times). State the
256 eigenvalues of the matrix $M^2 = M_x^2 + M_y^2 + M_z^2$.

4.22.22. Operator Properties 

(a) If $O$ is a quantum-mechanical operator, what is the definition of the
corresponding Hermitian conjugate operator, $O^\dagger$?

(b) Define what is meant by a Hermitian operator in quantum mechanics.

(c) Show that $d/dx$ is not a Hermitian operator. What is its Hermitian
conjugate, $(d/dx)^\dagger$?

(d) Prove that for any two operators $A$ and $B$, $(AB)^\dagger = B^\dagger A^\dagger$.

4.22.23. Ehrenfest’s Relations 

Show that the following relation applies for any operator $O$ that lacks an explicit
dependence on time:

$$\frac{d}{dt}\langle O\rangle = \frac{i}{\hbar}\langle[H, O]\rangle$$

HINT: Remember that the Hamiltonian, $H$, is a Hermitian operator, and that
$H$ appears in the time-dependent Schrodinger equation.

Use this result to derive Ehrenfest’s relations, which show that classical
mechanics still applies to expectation values:

$$m\frac{d}{dt}\langle x\rangle = \langle p\rangle, \quad \frac{d}{dt}\langle p\rangle = -\langle\nabla U\rangle$$
4.22.24. Solution of Coupled Linear ODEs

Consider the set of coupled linear differential equations $\dot{x} = Ax$ where
$x = (x_1, x_2, x_3) \in \mathbb{R}^3$ and

$$A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$$

(a) Find the general solution $x(t)$ in terms of $x(0)$ by matrix exponentiation.

(b) Using the results from part (a), write the general solution $x(t)$ by
expanding $x(0)$ in eigenvectors of $A$. That is, write

$$x(t) = e^{\lambda_1 t}c_1v_1 + e^{\lambda_2 t}c_2v_2 + e^{\lambda_3 t}c_3v_3$$

where $(\lambda_i, v_i)$ are the eigenvalue-eigenvector pairs for $A$ and the $c_i$ are
coefficients written in terms of the $x(0)$.


4.22.25. Spectral Decomposition Practice 

Find the spectral decomposition of the matrix 



0 

0 

—i 


0 


i 

0 


4.22.26. More on Projection Operators 

The basic definition of a projection operator is that it must satisfy $P^2 = P$. If
$P$ furthermore satisfies $P = P^\dagger$ we say that $P$ is an orthogonal projector. As
we derived in the text, the eigenvalues of an orthogonal projector are all equal
to either zero or one.

(a) Show that if $P$ is a projection operator, then so is $I - P$.

(b) Show that for any orthogonal projector $P$ and any normalized state,
$0 \le \langle P\rangle \le 1$.

(c) Show that the singular values of an orthogonal projector are also equal to
zero or one. The singular values of an arbitrary matrix $A$ are given by the
square-roots of the eigenvalues of $A^\dagger A$. It follows that for every singular
value $\sigma_i$ of a matrix $A$ there exists some unit normalized vector $u_i$ such
that

$$u_i^\dagger A^\dagger A u_i = \sigma_i^2$$

Conclude that the action of an orthogonal projection operator never
lengthens a vector (never increases its norm).
For the next two parts we consider the example of a non-orthogonal
projection operator



(d) Find the eigenvalues and eigenvectors of $N$. Does the usual spectral
decomposition work as a representation of $N$?

(e) Find the singular values of $N$. Can you interpret this in terms of the
action of $N$ on vectors in $\mathbb{R}^2$?
Chapter 5


Probability 


5.1. Probability Concepts 

Quantum mechanics will necessarily involve probability in order for us to make 
the connection with experimental measurements. 

We will be interested in understanding the quantity

$P(A|B)$ = probability of event $A$ given that event $B$ is true

In essence, event $B$ sets up the conditions or an environment and then we ask
about the (conditional) probability of event $A$ given that those conditions exist.
All probabilities are conditional in this sense. The $|$ symbol means given so that
items to the right of this conditioning symbol are taken as being true.

In other words, we set up an experimental apparatus, which is expressed by
properties $B$, and do a measurement with that apparatus, which is expressed
by properties $A$. We generate numbers (measurements) which we use to give a
value to the quantity $P(A|B)$.

5.1.1. Standard Thinking

We start with the standard mathematical formalism based on axioms. We define 
these events: 

1. $A$ = occurrence of $A$
(denotes that proposition $A$ is true)

2. $\sim A$ = NOT $A$ = nonoccurrence of $A$
(denotes that proposition $A$ is false)

3. $A \cap B$ = $A$ AND $B$ = occurrence of both $A$ and $B$
(denotes that proposition $A$ AND $B$ is true)

4. $A \cup B$ = $A$ OR $B$ = occurrence of at least one of $A$ and $B$
(denotes that proposition $A$ OR $B$ is true)

and standard Boolean logic as shown below: 

Boolean logic uses the basic statements AND, OR, and NOT. 
Using these and a series of Boolean expressions, the final 
output would be one TRUE or FALSE statement. 

This is illustrated below: 

1. If A is true and B is true, then (A AND B) is true

2. If A is true and B is false, then (A AND B) is false

3. If A is true and B is false, then (A OR B) is true

4. If A is false and B is false, then (A OR B) is false

or written as a truth table:

    A    B    (A ∩ B)    (A ∪ B)
    1    1       1          1
    1    0       0          1
    0    1       0          1
    0    0       0          0

Table 5.1: Boolean Logic


where 1 = TRUE and 0 = FALSE. 

Then we set up a theory of probability with these axioms: 

1. P(A|A) = 1 

This is the probability of the occurrence A given the occurrence of A. This 
represents a certainty and, thus, the probability must = 1. This is clearly 
an obvious assumption that we must make if our probability ideas are to 
make any sense at all. 

In other words, if I set the experimental apparatus such that the meter 
reads A, then it reads A with probability = 1. 

2. $0 \le P(A|B) \le P(B|B) = 1$

This just expresses the sensible idea that no probability is greater than
the probability of a certainty and it makes no sense to have the probability
be less than 0.

3. $P(A|B) + P(\sim A|B) = 1$ or $P(\sim A|B) = 1 - P(A|B)$

This just expresses the fact that the probability of something (anything)
happening ($A$ or $\sim A$) given $B$ is a certainty (= 1), that is, since the set $A$
or $\sim A$ includes everything that can happen, the total probability that one
or the other occurs must be the probability of a certainty and be equal to
one.

4. $P(A \cap B|C) = P(A|C)P(B|A \cap C)$

This says that the probability that 2 events $A$, $B$ both occur given that $C$
occurs equals the probability of $A$ given $C$ multiplied by the probability
of $B$ given $(A \cap C)$, which makes sense if you think of
them happening in sequence.

All other probability relationships can be derived from these axioms. 

The nonoccurrence of $A$ given that $A$ occurs must have probability = 0. This is
expressed by

$$P(\sim A|A) = 0 \tag{5.1}$$

This result clearly follows from the axioms since

$$P(A|B) + P(\sim A|B) = 1 \tag{5.2}$$

$$P(A|A) + P(\sim A|A) = 1 \tag{5.3}$$

$$P(\sim A|A) = 1 - P(A|A) = 1 - 1 = 0 \tag{5.4}$$

Example: Let us evaluate $P(X \cap Y|C) + P(X \cap \sim Y|C)$.

We use axiom (4) in the 1st term with $A = X$, $B = Y$ and in the 2nd term with
$A = X$, $B = \sim Y$ to get

$$\begin{aligned}
P(X \cap Y|C) + P(X \cap \sim Y|C) &= P(X|C)P(Y|X \cap C) + P(X|C)P(\sim Y|X \cap C) \\
&= P(X|C)\left[P(Y|X \cap C) + P(\sim Y|X \cap C)\right] \\
&= P(X|C)[1] = P(X|C)
\end{aligned} \tag{5.5}$$

where we have used axiom (3). Thus we have the result

$$P(X \cap Y|C) + P(X \cap \sim Y|C) = P(X|C) \tag{5.6}$$

Now let us use this result with $X = \sim A$, $Y = \sim B$. This gives

$$P(\sim A \cap \sim B|C) + P(\sim A \cap \sim\sim B|C) = P(\sim A|C) \tag{5.7}$$

$$P(\sim A \cap \sim B|C) + P(\sim A \cap B|C) = 1 - P(A|C) \tag{5.8}$$

$$P(\sim A \cap \sim B|C) = 1 - P(A|C) - P(\sim A \cap B|C) \tag{5.9}$$

Then use the result again with $X = B$, $Y = \sim A$. This gives

$$P(B \cap \sim A|C) + P(B \cap A|C) = P(B|C) \tag{5.10}$$
or

$$P(\sim A \cap B|C) = P(B|C) - P(A \cap B|C) \tag{5.11}$$

which gives

$$P(\sim A \cap \sim B|C) = 1 - P(A|C) - P(B|C) + P(A \cap B|C) \tag{5.12}$$

Now

$$P(A \cup B|C) = 1 - P(\sim(A \cup B)|C) = 1 - P((\sim A \cap \sim B)|C) \tag{5.13}$$

since

$$\sim(A \cup B) = (\sim A \cap \sim B) \tag{5.14}$$

i.e., we can construct a truth table as shown below, which illustrates the equality
directly

    A    B    ~(A ∪ B)    (~A ∩ ~B)
    1    1       0            0
    1    0       0            0
    0    1       0            0
    0    0       1            1

Table 5.2: Equivalent Expressions
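The two truth tables can also be generated mechanically. The following is a minimal sketch in Python (the function name `truth_table` is an illustrative choice, not something from the text); it enumerates all four combinations of A and B and checks that the columns for ~(A ∪ B) and (~A ∩ ~B) agree on every row.

```python
from itertools import product

def truth_table():
    """Rows of (A, B, A and B, A or B, not (A or B), (not A) and (not B))."""
    rows = []
    for a, b in product([True, False], repeat=2):
        rows.append((a, b, a and b, a or b, not (a or b), (not a) and (not b)))
    return rows

rows = truth_table()

# Eq. (5.14): the last two columns must agree on every row.
de_morgan_holds = all(row[4] == row[5] for row in rows)

for row in rows:
    print(" ".join(str(int(value)) for value in row))
print("~(A or B) equals (~A and ~B) on all rows:", de_morgan_holds)
```

The first four printed columns reproduce Table 5.1 and the last two reproduce Table 5.2 (with 1 = TRUE and 0 = FALSE).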


We finally get

$$P(A \cup B|C) = P(A|C) + P(B|C) - P(A \cap B|C) \tag{5.15}$$

which is a very important and useful result.

If we have $P(A \cap B|C) = 0$, then events $A$ and $B$ are said to be mutually exclusive
given that $C$ is true and the relation then reduces to

$$P(A \cup B|C) = P(A|C) + P(B|C) \tag{5.16}$$

This is the rule of addition of probabilities for exclusive events.
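As a concrete check of Eqs. (5.15) and (5.16), here is a small sketch that counts outcomes exactly. The conditioning information C is assumed, purely for illustration, to be "one roll of a fair six-sided die"; the events and helper names are likewise hypothetical.

```python
from fractions import Fraction

# Assumed background condition C: a fair six-sided die is rolled once.
OUTCOMES = tuple(range(1, 7))

def prob(event):
    """P(event|C): fraction of equally likely outcomes that satisfy event."""
    favourable = sum(1 for x in OUTCOMES if event(x))
    return Fraction(favourable, len(OUTCOMES))

def is_even(x):
    return x % 2 == 0

def above_three(x):
    return x > 3

# General addition rule, Eq. (5.15): the overlap is subtracted once.
p_union = prob(lambda x: is_even(x) or above_three(x))
p_rule = (prob(is_even) + prob(above_three)
          - prob(lambda x: is_even(x) and above_three(x)))

# Mutually exclusive events: the overlap term vanishes, Eq. (5.16).
p_one_or_six = prob(lambda x: x == 1 or x == 6)
p_exclusive_sum = prob(lambda x: x == 1) + prob(lambda x: x == 6)

print(p_union, p_rule)
print(p_one_or_six, p_exclusive_sum)
```

Exact fractions are used so that the two sides of each rule can be compared without rounding.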

Some other important results are:

$$\text{If } A \cap B = B \cap A, \text{ then } P(A|C)P(B|A \cap C) = P(B|C)P(A|B \cap C) \tag{5.17}$$

$$\text{If } P(A|C) \neq 0, \text{ then } P(B|A \cap C) = P(A|B \cap C)\frac{P(B|C)}{P(A|C)} \tag{5.18}$$

which is Bayes’ theorem. It relates the probability of $B$ given $A$ to the
probability of $A$ given $B$.

When we say that $B$ is independent of $A$, we will mean

$$P(B|A \cap C) = P(B|C) \tag{5.19}$$

or the occurrence of $A$ has NO influence on the probability of $B$ given $C$. Using
axiom (4) we then have the result:

If $A$ and $B$ are independent given $C$, then

$$P(A \cap B|C) = P(A|C)P(B|C) \tag{5.20}$$

This is called statistical or stochastic independence. The result generalizes to a
set of events $\{A_i, i = 1, 2, \ldots, n\}$. All these events are independent if and only
if

$$P(A_1 \cap A_2 \cap \cdots \cap A_m|C) = P(A_1|C)P(A_2|C)\cdots P(A_m|C) \tag{5.21}$$

for all $m \le n$.
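The definition in Eq. (5.19) and its consequence, Eq. (5.20), can be checked by direct counting. The sketch below assumes, for illustration only, that C means "two fair six-sided dice are rolled"; the three events are hypothetical examples chosen so that one pair is independent and another is not.

```python
from fractions import Fraction
from itertools import product

# Assumed background condition C: two fair six-sided dice are rolled.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event|C) by counting equally likely outcomes."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def cond_prob(event, given):
    """P(event | given and C), obtained by rearranging axiom (4)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

def first_even(o):
    return o[0] % 2 == 0

def second_high(o):
    return o[1] > 4

def sum_at_least_10(o):
    return o[0] + o[1] >= 10

# Eq. (5.19): knowing the first die says nothing about the second die.
independent = cond_prob(second_high, first_even) == prob(second_high)

# Eq. (5.20): the joint probability then factorizes.
factorizes = (prob(lambda o: first_even(o) and second_high(o))
              == prob(first_even) * prob(second_high))

# The total depends on the first die, so these two events are not independent.
dependent = cond_prob(sum_at_least_10, first_even) != prob(sum_at_least_10)

print(independent, factorizes, dependent)
```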

Now let us think about these ideas in another way that has fundamental impor¬ 
tance in modern approaches to quantum theory. The fundamental result in this 
view will turn out to be the Bayes formula and its relationship to measurements. 

5.1.2. Bayesian Thinking

Two Different Axioms 

1. If we specify how much we believe something is true, then we must have 
implicitly specified how much we believe it is false. 

2. If we first specify how much we believe that proposition Y is true, and 
then state how much we believe X is true given that Y is true, then we 
must implicitly have specified how much we believe that both X and Y 
are true. 

We assign real numbers to each proposition in a manner so that the larger the 
numerical value associated with a proposition, the more we believe it. 

We use only the rules of Boolean logic and ordinary algebra, together with a
consistency requirement: if there are several different ways of using the same
information, then we should always arrive at the same conclusions, independent
of the particular analysis path chosen. It is then found that this consistency
can only be guaranteed if the real numbers we attach to our beliefs in the
various propositions can be mapped (or transformed) to another set of real
positive numbers which obey the usual rules of probability theory:

$$prob(X|I) + prob(\sim X|I) = 1 \quad \text{(same as axiom (3))} \tag{5.22}$$

$$prob(X \cap Y|I) = prob(X|Y \cap I) \times prob(Y|I) \quad \text{(same as axiom (4))} \tag{5.23}$$

The first of these equations is called the sum rule and states (as earlier) that
the probability that $X$ is true plus the probability that $X$ is false is equal to
one.

The second of these equations is called the product rule. It states (as earlier)
that the probability that both $X$ and $Y$ are true is equal to the probability that
$X$ is true given that $Y$ is true times the probability that $Y$ is true (independent
of $X$).

Note all the probabilities are conditional on proposition(s) or conditioning(s) I, 
which denotes the relevant background information on hand. It is important to 
understand that there is no such thing as an absolute probability (one without 
prior information). 


Bayes’ Theorem and Marginalization

As before, we can use the sum and product rules to derive other results.

First, starting with the product rule we have

$$prob(X \cap Y|I) = prob(X|Y \cap I) \times prob(Y|I) \tag{5.24}$$

We can rewrite this equation with $X$ and $Y$ interchanged

$$prob(Y \cap X|I) = prob(Y|X \cap I) \times prob(X|I) \tag{5.25}$$

Since the probability that both $X$ and $Y$ are true must be logically the same as
the probability that both $Y$ and $X$ are true we must also have

$$prob(Y \cap X|I) = prob(X \cap Y|I) \tag{5.26}$$

or

$$prob(X|Y \cap I) \times prob(Y|I) = prob(Y|X \cap I) \times prob(X|I) \tag{5.27}$$

or

$$prob(X|Y \cap I) = \frac{prob(Y|X \cap I) \times prob(X|I)}{prob(Y|I)} \tag{5.28}$$

which is Bayes’ theorem (as derived earlier).


Most standard treatments of probability do not attach much importance to 
Bayes’ rule. 

This rule, which relates $prob(A|B \cap C)$ to $prob(B|A \cap C)$, allows us to turn things
around with respect to the conditioning symbol, which leads to a reorientation
of our thinking about probability.


The fundamental importance of this property to data analysis becomes apparent
if we replace $A$ and $B$ by hypothesis and data:

$$prob(A|B \cap C) \propto prob(B|A \cap C) \times prob(A|C) \tag{5.29}$$

$$prob(hypothesis|data \cap C) \propto prob(data|hypothesis \cap C) \times prob(hypothesis|C)$$

Note that the equality has been replaced with a proportionality because the
term $prob(data|C)$ = evidence has been omitted. The proportionality constant
can be found from the normalization requirement that the sum of the
probabilities for something happening must equal 1.

The power of Bayes’ theorem lies in the fact that it relates the quantity of in¬ 
terest, the probability that the hypothesis is true given the data, to the term 
that we have a better chance of being able to assign, the probability that we 
would have obtained the measured data if the hypothesis was true. 

The various terms in Bayes’ theorem have formal names. 

The term $prob(hypothesis|C)$ = prior probability represents our state of
knowledge (or ignorance) about the truth of the hypothesis before we have analyzed
the current data. This is modified by the experimental measurements through
the term $prob(data|hypothesis \cap C)$ = likelihood function. This product gives
$prob(hypothesis|data \cap C)$ = posterior probability representing our state of
knowledge about the truth of the hypothesis in the light of the data (after
measurements).

In some sense, Bayes’ theorem encapsulates the process of learning, as we shall
see later.
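To make the roles of prior, likelihood, evidence and posterior concrete, here is a minimal sketch of a Bayesian update over a discrete set of hypotheses. The two hypotheses ("fair" and "biased" coins), their prior weights and the likelihood values are invented for illustration and do not come from the text.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Posterior prob(hypothesis|data) from prior and likelihood dictionaries.

    The evidence prob(data|C) is recovered from the normalization
    requirement that the posterior probabilities sum to one.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: value / evidence for h, value in unnormalized.items()}

# Hypothetical example: a coin is either fair or lands heads 4 times in 5.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood_of_head = {"fair": Fraction(1, 2), "biased": Fraction(4, 5)}

# Observing one head shifts belief toward the biased hypothesis.
after_one_head = posterior(prior, likelihood_of_head)

# Learning: yesterday's posterior is today's prior for the next head.
after_two_heads = posterior(after_one_head, likelihood_of_head)

print(after_one_head)
print(after_two_heads)
```

Feeding the posterior back in as the next prior is one simple way to see the sense in which the theorem describes learning from data.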

Second, consider the following results from the product rule

$$prob(X \cap Y|I) = prob(Y \cap X|I) = prob(Y|X \cap I) \times prob(X|I) \tag{5.30}$$

$$prob(X \cap \sim Y|I) = prob(\sim Y \cap X|I) = prob(\sim Y|X \cap I) \times prob(X|I) \tag{5.31}$$

Adding these equations we get

$$prob(X \cap Y|I) + prob(X \cap \sim Y|I) = \left(prob(Y|X \cap I) + prob(\sim Y|X \cap I)\right) prob(X|I) \tag{5.32}$$

Since $prob(Y|X \cap I) + prob(\sim Y|X \cap I) = 1$ we have

$$prob(X \cap Y|I) + prob(X \cap \sim Y|I) = prob(X|I) \tag{5.33}$$

which, again, is the same result as earlier. If, on the other hand,
$Y \to \{Y_k, k = 1, 2, \ldots, M\}$ representing a set of $M$ alternative possibilities, then we generalize
the two-state result above as

$$\sum_{k=1}^{M} prob(X \cap Y_k|I) = prob(X|I) \tag{5.34}$$


We can derive this result as follows

$$prob(X \cap Y_1|I) = prob(Y_1 \cap X|I) = prob(Y_1|X \cap I) \times prob(X|I) \tag{5.35}$$

$$prob(X \cap Y_2|I) = prob(Y_2 \cap X|I) = prob(Y_2|X \cap I) \times prob(X|I) \tag{5.36}$$

$$\vdots \tag{5.37}$$

$$prob(X \cap Y_M|I) = prob(Y_M \cap X|I) = prob(Y_M|X \cap I) \times prob(X|I) \tag{5.38}$$

Adding these equations we get

$$\sum_{k=1}^{M} prob(X \cap Y_k|I) = prob(X|I)\left(\sum_{k=1}^{M} prob(Y_k|X \cap I)\right) \tag{5.39}$$

If we assume that the $\{Y_k\}$ form a mutually exclusive and exhaustive set of
possibilities, that is, exactly one of the $Y_k$ is true (if one of them is true, then
all the others must be false), we then get

$$\sum_{k=1}^{M} prob(Y_k|X \cap I) = 1 \tag{5.40}$$

which is a normalization condition. This completes the derivation.

If we go to the continuum limit where we consider an arbitrarily large number
of propositions about some result (the range in which a given result might lie),
then as long as we choose the intervals in a contiguous fashion, and cover a big
enough range of values, we will have a mutually exclusive and exhaustive set of
possibilities. In the limit of $M \to \infty$, we obtain

$$prob(X|I) = \int_{-\infty}^{\infty} prob(X \cap Y|I)\,dY \tag{5.41}$$

which is the marginalization equation. The integrand here is technically a
probability density function (pdf) rather than a probability. It is defined by

$$pdf(X \cap Y = y|I) = \lim_{\delta y \to 0} \frac{prob(X \cap y \le Y < y + \delta y|I)}{\delta y} \tag{5.42}$$

and the probability that the value of $Y$ lies in a finite range between $y_1$ and $y_2$
(and $X$ is also true) is given by

$$prob(X \cap y_1 \le Y < y_2|I) = \int_{y_1}^{y_2} pdf(X \cap Y|I)\,dY \tag{5.43}$$

which leads directly to the marginalization equation.

In this continuum limit the normalization condition takes the form

$$1 = \int_{-\infty}^{\infty} pdf(Y|X \cap I)\,dY \tag{5.44}$$

Marginalization is a very powerful device in data analysis because it enables us 
to deal with nuisance parameters, that is, quantities which necessarily enter the 
analysis but are of no intrinsic interest. The unwanted background signal present 
in many experimental measurements, and instrumental parameters which are 
difficult to calibrate, are examples of nuisance parameters. 
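The discrete form of marginalization, Eq. (5.34), is easy to illustrate numerically. In the sketch below the three alternatives Y_k and all probability values are hypothetical numbers chosen only so that the sums can be followed by hand; Y plays the role of a nuisance parameter that is summed away.

```python
from fractions import Fraction

# Hypothetical inputs: prob(Y_k|I) for three mutually exclusive and
# exhaustive alternatives, and prob(X|Y_k and I) for each alternative.
prob_Y = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
prob_X_given_Y = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 1)]

# Product rule: prob(X and Y_k|I) = prob(X|Y_k and I) * prob(Y_k|I).
joint = [x * y for x, y in zip(prob_X_given_Y, prob_Y)]

# Marginalization, Eq. (5.34): summing over k removes the nuisance variable.
prob_X = sum(joint)

# Normalization, Eq. (5.40): prob(Y_k|X and I) sums to one over k.
prob_Y_given_X = [j / prob_X for j in joint]

print(prob_X)
print(prob_Y_given_X, sum(prob_Y_given_X))
```

In the continuum limit of Eq. (5.41) the sum over k would be replaced by an integral over Y, for example a numerical quadrature over a grid of Y values.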
5.2. Probability Interpretation


In the standard way of thinking about probability in relation to experiments, 
measured results are related to probabilities using the concept of a limit fre¬ 
quency. The limit frequency is linked to probability by this definition: 

If $C$ can lead to either $A$ or $\sim A$ and if in
$n$ repetitions, $A$ occurs $m$ times, then

$$P(A|C) = \lim_{n \to \infty} \frac{m}{n} \tag{5.45}$$

We must now connect the mathematical formalism with this limit frequency 
concept so that we can use the formalism to make predictions for experiments 
in real physical systems. 

This approach depends on whether we can prove that the limit makes sense for 
real physical systems. Let us see how we can understand the real meaning of 
the above interpretation of probability and thus learn how to use it in quantum 
mechanics, where probability will be the dominant property. 

Suppose that we have an experimental measurement, $M$, that can yield either
$A$ or $\sim A$ as results, with a probability for result $A$ given by

$$P(A|M) = p \tag{5.46}$$

In general, we let any sequence of $n$ independent measurements be labeled as
event $M^n$ and we define $n_A$ as the number of times $A$ occurs, where $0 \le n_A \le n$.

Now imagine we carry out a sequence of $n$ independent measurements and we
find that $A$ occurs $r$ times. The probability for a sequence of results that includes
result $A$ occurring $r$ times and $\sim A$ occurring $(n - r)$ times (independent of their
order in the sequence) is given by

$$p^r q^{n-r} \tag{5.47}$$

where

$$q = P(\sim A|M) = 1 - P(A|M) = 1 - p \tag{5.48}$$

The different sequence orderings are mutually exclusive events and thus we have

$$P(n_A = r|M^n) = \sum_{\text{all possible orderings}} p^r q^{n-r} \tag{5.49}$$

The sum

$$\sum_{\text{all possible orderings}} \tag{5.50}$$

just counts the number of ways to distribute $r$ occurrences of $A$ and $(n - r)$
occurrences of $\sim A$, where all the terms contain the common factor $p^r q^{n-r}$. This
count is the binomial coefficient (more about the Binomial probability
distribution later)

$$\frac{n!}{r!(n-r)!} \tag{5.51}$$

so that

$$P(n_A = r|M^n) = \frac{n!}{r!(n-r)!}\, p^r q^{n-r} \tag{5.52}$$

Now to get to the heart of the problem. The frequency of $A$ in $M^n$ is given by

$$f_n = \frac{n_A}{n} \tag{5.53}$$

This is not necessarily = $p$ in any set of measurements.


What is the relationship between them? Consider the following:

$$\begin{aligned}
\langle n_A\rangle &= \text{average or expectation value} \\
&= \text{sum over [possible values times probability of that value]} \\
&= \sum_{r=0}^{n} r\,P(n_A = r|M^n) = \sum_{r=0}^{n} r\,\frac{n!}{r!(n-r)!}\, p^r q^{n-r}
\end{aligned} \tag{5.54}$$


We now use a clever mathematical trick to evaluate this sum. For the moment 
consider p and q to be two arbitrary independent variables. At the end of the 
calculation we will let q = 1 - p as is appropriate for a real physical system. 


From the Binomial expansion formula, we have, in general,

$$\sum_{r=0}^{n} \frac{n!}{r!(n-r)!}\, p^r q^{n-r} = (p + q)^n \tag{5.55}$$

We then have

$$p\frac{\partial}{\partial p}\sum_{r=0}^{n} \frac{n!}{r!(n-r)!}\, p^r q^{n-r} = p\frac{\partial}{\partial p}(p + q)^n \tag{5.56}$$

so that

$$\sum_{r=0}^{n} r\,\frac{n!}{r!(n-r)!}\, p^r q^{n-r} = np(p + q)^{n-1} \tag{5.57}$$

This gives

$$\sum_{r=0}^{n} r\,P(n_A = r|M^n) = np(p + q)^{n-1} \tag{5.58}$$

or

$$\langle n_A\rangle = np(p + q)^{n-1} \tag{5.59}$$

In a real physical system, we must have $p + q = 1$, so that we end up with the
result

$$\langle n_A\rangle = np \tag{5.60}$$

and

$$\langle f_n\rangle = \frac{\langle n_A\rangle}{n} = p \tag{5.61}$$

This says that $p$ = the average frequency.
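The results (5.52), (5.60) and (5.61) can be verified by brute-force summation for a particular case. The values n = 10 and p = 3/10 below are arbitrary illustrative choices, and `binomial_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(n_A = r | M^n) from Eq. (5.52), with q = 1 - p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n = 10
p = Fraction(3, 10)

# The probabilities of all possible counts r = 0..n must sum to one.
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))

# Eq. (5.54): expectation value as sum of value times probability.
mean_count = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
mean_frequency = mean_count / n

# Probability that the observed frequency equals p exactly (r = 3 here).
prob_frequency_equals_p = binomial_pmf(3, n, p)

print(total, mean_count, mean_frequency)
print(float(prob_frequency_equals_p))
```

The last quantity is well below one, which illustrates the remark that the frequency in any single set of measurements is not necessarily equal to p even though its average is.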

This does not say, however, that $f_n$ is close to $p$.

Now consider a more general experiment where the outcome of a measurement
is the value of some continuous variable $Q$, with probability density (for its
continuous spectrum) given by

$$P(q < Q < q + dq|M) = h(q)\,dq \tag{5.62}$$

If we let $h(q)$ contain delta-functions, then this derivation is also valid for the
discrete part of the spectrum. We can now derive the following useful result. If
$Q$ is a nonnegative variable, which means that $h(q) = 0$ for $q < 0$, then for any
$\epsilon > 0$

$$\langle Q\rangle = \int_0^\infty h(q)\,q\,dq \ge \int_\epsilon^\infty h(q)\,q\,dq \ge \epsilon\int_\epsilon^\infty h(q)\,dq = \epsilon P(Q \ge \epsilon|M) \tag{5.63}$$

This implies that

$$P(Q \ge \epsilon|M) \le \frac{\langle Q\rangle}{\epsilon} \tag{5.64}$$

Now we apply this result to the nonnegative variable $|Q - c|^\alpha$, where $\alpha > 0$ and
$c$ = number, to obtain

$$P(|Q - c| \ge \epsilon|M) = P(|Q - c|^\alpha \ge \epsilon^\alpha|M) \le \frac{\langle|Q - c|^\alpha\rangle}{\epsilon^\alpha} \tag{5.65}$$

which is called Chebyshev’s inequality.

In the special case where $\alpha = 2$, $c = \langle Q\rangle$ = mean of distribution, we have

$$\langle|Q - c|^2\rangle = \langle|Q - \langle Q\rangle|^2\rangle = \langle Q^2\rangle - \langle Q\rangle^2 = \sigma^2 = \text{variance} \tag{5.66}$$

so that letting $\epsilon = k\sigma$ we get

$$P(|Q - \langle Q\rangle| \ge k\sigma|M) \le \frac{1}{k^2} \tag{5.67}$$

or, the probability of $Q$ being $k$ or more standard deviations from the mean is
no greater than $1/k^2$ (independent of the form of the probability distribution).

In a similar manner, it can also be shown that

$$P(|f_n - p| \ge \delta|M) \le \frac{1}{n\delta^2} \tag{5.68}$$

which implies that the probability of $f_n$ (the relative frequency of $A$ in $n$
independent repetitions of $M$) being more than $\delta$ away from $p$ converges to 0 as
$n \to \infty$. This is an example of the law of large numbers in action. This DOES
NOT say $f_n = p$ at any time or that $f_n$ remains close to $p$ as $n \to \infty$.

It DOES say that the deviation of $f_n$ from $p$ becomes more and more improbable
or that the probability of any given deviation approaches 0 as $n \to \infty$.

It is in this sense that one uses the limit frequency from experiment to compare 
with theoretical probability predictions in physics. From probability theory one 
derives only statements of probability, not of necessity. 
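The bound in Eq. (5.68) can be compared with the exact tail probability computed from the binomial distribution (5.52). The sketch below uses the illustrative values p = 1/2 and δ = 1/10, which are assumptions and not taken from the text. It relies on the standard fact that the variance of f_n is pq/n, so that Chebyshev's inequality (5.67) gives pq/(nδ²), which is never larger than 1/(nδ²).

```python
from fractions import Fraction
from math import comb

def tail_probability(n, p, delta):
    """Exact P(|f_n - p| >= delta) for n independent trials with P(A|M) = p."""
    q = 1 - p
    return sum(
        comb(n, r) * p**r * q ** (n - r)
        for r in range(n + 1)
        if abs(Fraction(r, n) - p) >= delta
    )

p = Fraction(1, 2)
delta = Fraction(1, 10)

results = []
for n in (10, 100, 1000):
    tail = tail_probability(n, p, delta)
    chebyshev = p * (1 - p) / (n * delta**2)  # from the variance pq/n of f_n
    bound = 1 / (n * delta**2)                # right-hand side of Eq. (5.68)
    results.append((n, tail, chebyshev, bound))
    print(n, float(tail), float(chebyshev), float(bound))
```

The exact tail shrinks much faster than the bound in this example. The bound is weak but holds for any p, and it is only a statement about probabilities: a large deviation becomes improbable, not impossible.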


5.3. First hints of “subversive” or “Bayesian” thinking

How do we reason in situations where it is not possible to argue with certainty? 
In other words, is there a way to use the techniques of deductive logic to study 
the inference problem arising when using inductive logic? No matter what sci¬ 
entists say, this is what they are actually doing most of the time. 

The answer to this last question resides in the Bayes’ rule. 

To Bayes (along with Bernoulli and Laplace), a probability represented a
“degree-of-belief” or “plausibility”, that is, how much one thinks that something is true,
based on the evidence on hand.

The developers of standard probability theory (Fisher, Neyman and Pearson) 
thought this seemed too vague and subjective a set of ideas to be the basis 
of a “rigorous” mathematical theory. Therefore, they defined probability as 
the long-run relative frequency with which an event occurred, given infinitely 
many repeated experimental trials. Since such probabilities can be measured, 
probability was then thought to be an objective tool for dealing with random 
phenomena. 

This frequency definition certainly seems to be more objective, but it turns out 
that its range of validity is far more limited. 

In this Bayesian view, probability represents a state of knowledge. The condi¬ 
tional probabilities represent logical connections rather than causal ones. 

Example: 

Consider an urn that contains 5 red balls and 7 green balls. 

If a ball is selected at random, then we would all agree that the probability of 
picking a red ball would be 5/12 and of picking a green ball would be 7/12. 
If the ball is not returned to the urn, then it seems reasonable that the
probability of picking a red or green ball must depend on the outcome of the first
pick (because there will be one less red or green ball in the urn).

Now suppose that we are not told the outcome of the first pick, but are given 
the result of the second pick. Does the probability of the first pick being red or 
green change with the knowledge of the second pick? 

Initially, many observers would probably say no: at the time of the first
draw, there were still 5 red balls and 7 green balls in the urn, so the
probabilities for picking red and green should still be 5/12 and 7/12 independent of
the outcome of the second pick. The error in this argument becomes clear if
we consider the extreme example of an urn containing only 1 red and 1 green
ball. Although the second pick cannot affect the first pick in a physical sense,
knowledge of the second result does influence what we can infer about the
outcome of the first pick, that is, if the second ball was green, then the first ball
must have been red, and vice versa.

We can calculate the result as shown below:

$Y$ = pick is GREEN (2nd pick)

$X$ = pick is RED (1st pick)

$I$ = initial number of RED/GREEN balls = $(n, m)$

A Bayesian would say:

$$prob(X|Y \cap I) = \frac{prob(Y|X \cap I) \times prob(X|I)}{prob(Y|I)} \tag{5.69}$$

$$prob(X|Y \cap \{n, m\}) = \frac{\dfrac{m}{n+m-1} \times \dfrac{n}{n+m}}{\dfrac{n}{n+m}\,\dfrac{m}{n+m-1} + \dfrac{m}{n+m}\,\dfrac{m-1}{n+m-1}} = \frac{nm}{nm + m(m-1)} = \frac{n}{n+m-1} \tag{5.70}$$

$$n = m = 1 \Rightarrow prob(X|Y \cap \{1, 1\}) = \frac{1}{1+1-1} = 1 \tag{5.71}$$

$$n = 5, m = 7 \Rightarrow prob(X|Y \cap \{5, 7\}) = \frac{5}{5+7-1} = \frac{5}{11} = 0.455 \tag{5.72}$$

Non-Bayesian says:

$$prob(X|\{5, 7\}) = \frac{5}{12} = 0.417 \tag{5.73}$$

Clearly, the Bayesian and Non-Bayesian disagree.
However, the non-Bayesian is just assuming that the calculated result 0.417 is
correct, whereas the Bayesian is using the rules of probability (Bayes’ Rule) to
infer the result 5/11 = 0.455 correctly.
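The urn calculation can be checked in two independent ways: by evaluating Bayes' rule, Eq. (5.69), with exact fractions, and by listing every ordered pair of draws and counting. The following sketch does both; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

def first_red_given_second_green(n, m):
    """Bayes' rule for n red and m green balls drawn without replacement."""
    total = n + m
    prior = Fraction(n, total)            # prob(X|I): first pick is red
    likelihood = Fraction(m, total - 1)   # prob(Y|X and I): then second is green
    evidence = (Fraction(n, total) * Fraction(m, total - 1)
                + Fraction(m, total) * Fraction(m - 1, total - 1))  # prob(Y|I)
    return likelihood * prior / evidence

def by_enumeration(n, m):
    """Same quantity by counting equally likely ordered pairs of distinct balls."""
    balls = ["R"] * n + ["G"] * m
    pairs = list(permutations(range(len(balls)), 2))
    second_green = [(i, j) for i, j in pairs if balls[j] == "G"]
    first_red = [(i, j) for i, j in second_green if balls[i] == "R"]
    return Fraction(len(first_red), len(second_green))

print(first_red_given_second_green(1, 1), by_enumeration(1, 1))
print(first_red_given_second_green(5, 7), by_enumeration(5, 7))
print(float(first_red_given_second_green(5, 7)), float(Fraction(5, 12)))
```

Both routes give 5/11 for the urn with 5 red and 7 green balls, compared with 5/12 when the information about the second pick is ignored.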

The concerns about the subjectivity of the Bayesian view of probability are
understandable. I think that the presumed shortcomings of the Bayesian approach
merely reflect a confusion between subjectivity and the difficult technical
question of how probabilities (especially prior probabilities) should be assigned.

The popular argument is that if a probability represents a degree-of-belief, then 
it must be subjective, because my belief could be different from yours. The 
Bayesian view is that a probability does indeed represent how much we believe 
that something is true, but that this belief should be based on all the relevant 
information available (all prior probabilities). 

While this makes the assignment of probabilities an open-ended question, be¬ 
cause the information available to me may not be the same as that available to 
you, it is not the same as subjectivity. It simply means that probabilities are 
always conditional, and this conditioning must be stated explicitly. 

Objectivity demands only that two people having the 
same information should assign the same probability. 

Cox looked at the question of plausible reasoning from the perspective of logical 
consistency. He found that the only rules that worked were those of probability 
theory! Although the sum and product rules of probability are straightforward 
to prove for frequencies (using Venn diagrams), Cox showed that their range 
of validity goes much further. Rather than being restricted to frequencies, he 
showed that probability theory constitutes the basic calculus for logical and 
consistent plausible reasoning, which means scientific inference! 

Another Example - Is this a fair coin? 

We consider a simple coin-tossing experiment. Suppose that I had found this 
coin and we observed 4 heads in 11 flips. 

If by the word fair we mean that we would be prepared to make a 50 : 50 bet 
on the outcome of a flip being a head or a tail, then do you think that it is a 
fair coin? 

If we ascribe fairness to the coin, then we naturally ask how sure we are that
this is so; if it is not fair, how unfair do we think it is?

A way of formulating this problem is to consider a large number of contiguous
hypotheses about the range in which the bias-weighting of the coin might lie.
If we denote bias-weighting by H, then H = 0 and H = 1 can represent a coin
which produces a tail (not a head!) or a head on every flip, respectively. There
is a continuum of possibilities for the value of H between these limits, with
H = 1/2 indicating a fair coin. The hypotheses might then be, for example

(a) 0.00 < H < 0.01

(b) 0.01 < H < 0.02

(c) 0.02 < H < 0.03

and so on

Our state of knowledge about the fairness, or the degree of unfairness, of the 
coin is then completely summarized by specifying how much we believe these 
various hypotheses to be true. If we assign a high probability to one (or a closely 
grouped few) of these hypotheses, compared to others, then this indicates that 
we are confident in our estimate of the bias-weighting. If there was no such 
distinction, then it would reflect a high level of ignorance about the nature of 
the coin. 


In this case, our inference about the fairness of the coin is summarized by the
conditional pdf $\mathrm{prob}(H \mid \{\text{data}\} \cap I)$. This is just a representation of the
limiting case of a continuum of hypotheses for the value of H, that is, the
probability that H lies in an infinitesimally narrow range between $h$ and
$h + \delta h$ is given by $\mathrm{prob}(H = h \mid \{\text{data}\} \cap I)\,dH$. To estimate this posterior
pdf, we need to use Bayes' theorem, which relates the pdf of interest to two
others that are easier to assign:

$$\mathrm{prob}(H \mid \{\text{data}\} \cap I) \propto \mathrm{prob}(\{\text{data}\} \mid H \cap I) \times \mathrm{prob}(H \mid I) \tag{5.74}$$

We have omitted the denominator $\mathrm{prob}(\{\text{data}\} \mid I)$ since it does not involve
bias-weighting explicitly and replaced the equality by a proportionality. The
omitted constant can be determined by normalization

$$\int_0^1 \mathrm{prob}(H \mid \{\text{data}\} \cap I)\,dH = 1 \tag{5.75}$$

The prior pdf, $\mathrm{prob}(H \mid I)$, on the right side represents what we know about
the coin given only that I found the coin. This means that we should keep an
open mind about the nature of the coin. A simple probability assignment which
reflects this is a uniform pdf

$$\mathrm{prob}(H \mid I) = \begin{cases} 1 & 0 \le H \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{5.76}$$


This prior state of knowledge (or ignorance) is modified by the data through the
likelihood function, $\mathrm{prob}(\{\text{data}\} \mid H \cap I)$, which is a measure of the chance that
we would have obtained the data we actually observed if the value of the
bias-weighting H was given (as known). If, in the conditioning information I, we
assume that the flips of the coin were independent events, so that the outcome
of one did not influence that of another, then the probability of obtaining the
data R heads in N tosses is given by the binomial distribution

$$\mathrm{prob}(\{\text{data}\} \mid H \cap I) \propto H^R (1 - H)^{N-R} \tag{5.77}$$

The product of these last two results then gives the posterior pdf that we require.

It represents our state of knowledge about the nature of the coin in light of the
data.
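
As a concrete illustration, the posterior for the data mentioned at the start of this example (4 heads in 11 flips) can be evaluated on a grid. This is a minimal sketch, assuming a uniform prior and a simple midpoint-rule normalization; the function name and grid size are illustrative choices, not part of the text.

```python
def coin_posterior(heads, tosses, n_grid=2000):
    """Posterior pdf for the bias-weighting H on a grid, with a uniform prior on [0, 1]."""
    step = 1.0 / n_grid
    grid = [(i + 0.5) * step for i in range(n_grid)]                   # bin midpoints
    prior = [1.0] * n_grid                                             # uniform prior
    likelihood = [h**heads * (1.0 - h)**(tosses - heads) for h in grid]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]          # prior x likelihood
    norm = sum(unnormalized) * step                                    # integral over [0, 1]
    return grid, [u / norm for u in unnormalized]

grid, post = coin_posterior(heads=4, tosses=11)
step = grid[1] - grid[0]
mode = grid[post.index(max(post))]
mean = sum(h * p for h, p in zip(grid, post)) * step
near_fair = sum(p for h, p in zip(grid, post) if 0.45 <= h <= 0.55) * step
print(f"mode = {mode:.3f}, mean = {mean:.3f}, prob(0.45 <= H <= 0.55) = {near_fair:.3f}")
```

With a uniform prior the posterior peaks at R/N = 4/11 and has mean (R+1)/(N+2) = 5/13, but it is still broad after only 11 flips, so a fair coin is far from excluded.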

It is instructive to see how this pdf evolves as we obtain more and more data
pertaining to the coin. A computer simulation, shown below, allows us to
demonstrate what happens in some typical cases.

The simulation allows for three distinct and very different prior probabilities: 

(1) Uniform distribution 

(2) Gaussian distribution centered around 0.5 with some spread 

(3) Sum of two Gaussians with different centers 

These prior probabilities represent very different initial knowledge:

(1) total ignorance - we have no idea if it is fair

(2) knowledge that mean is 0.5 [with spread] - we think it is fair

(3) knowledge that it is unfair (either all tails or all heads) [with spreads]

In the simulation we can choose the true mean value ($h_0$), which is then reflected
in the simulated coin tosses (the data).

As can be seen from the images below, the only effect that different prior
probabilities have is to change how quickly the posterior pdf evolves toward its
final form (which is eventually the same in all cases)!

(1) total ignorance - we have no idea if it is fair

[Figure 5.1: Total Ignorance. Panels: (a) Prior, (b) Posterior]

(2) knowledge that mean is 0.5 [with spread] - we think it is fair

[Figure 5.2: Knowledge that Mean is 0.5. Panels: (a) Prior, (b) Posterior. Plot title: Posterior pdf = prob({data}|h,I) * prob(h|I)]

(3) knowledge that it is unfair (either all tails or all heads) [with spreads]

[Figure 5.3: Knowledge that it is Unfair. Panels: (a) Prior, (b) Posterior]

In each case, the first figure shows the posterior pdf for H given no data (it is
the same as the prior pdf) and the second figure shows the posterior pdf after
1000 tosses. They clearly indicate that, no matter what our initial knowledge,
the final posterior pdf will be the same; that is, with enough data the posterior
pdf is dominated by the likelihood function (the actual data) and becomes
effectively independent of the prior pdf.
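
The convergence described here can be reproduced with a few lines of code. The following is a rough sketch rather than the simulation used for the figures: the prior centers and widths are illustrative assumptions, and to keep the result deterministic the "data" are idealized as exactly int(h0 * N) heads in N tosses instead of random flips.

```python
import math

def gaussian(h, center, width):
    """Unnormalized Gaussian bump used to build the non-uniform priors."""
    return math.exp(-0.5 * ((h - center) / width) ** 2)

# Three priors in the spirit of the list above (centers and widths are assumptions).
PRIORS = {
    "uniform": lambda h: 1.0,
    "fair": lambda h: gaussian(h, 0.5, 0.1),
    "unfair": lambda h: gaussian(h, 0.05, 0.1) + gaussian(h, 0.95, 0.1),
}

def posterior_mean(prior, heads, tosses, n_grid=1000):
    """Posterior mean of H on a grid for prior(h) * h^R * (1-h)^(N-R)."""
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]    # midpoints avoid log(0)
    log_post = []
    for h in grid:
        p = prior(h)
        log_prior = math.log(p) if p > 0.0 else -math.inf
        log_post.append(log_prior + heads * math.log(h)
                        + (tosses - heads) * math.log(1.0 - h))
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]    # rescaled to avoid underflow
    return sum(h * w for h, w in zip(grid, weights)) / sum(weights)

def spread_of_means(h0, tosses):
    """Largest minus smallest posterior mean across the three priors."""
    heads = int(h0 * tosses)    # idealized data: the expected number of heads
    means = [posterior_mean(f, heads, tosses) for f in PRIORS.values()]
    return max(means) - min(means)

for n in (10, 100, 1000):
    print(n, round(spread_of_means(0.25, n), 4))
```

The spread between the three posterior means shrinks as N grows, which is the numerical counterpart of the statement that the likelihood eventually dominates the prior.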

5.3.1. The Problem of Prior Probabilities 

We are now faced with the most difficult question. How do we assign
probabilities based on prior information?

The oldest idea was devised by Bernoulli - the principle of insufficient reason 
or the principle of indifference. It states that if we determine a set of basic, 
mutually exclusive, possibilities, and we have no reason to believe that any one 
of them is more likely to be true than another, then we must assign the same 
probability to each of them. Clearly, this makes sense. Think of flipping a coin 
with two possibilities, heads and tails. If it is a legitimate coin, then we have 
no reason to favor heads over tails and we must assign equal probability to each 
possibility, that is, 


$$\mathrm{prob}(\text{heads} \mid I) = \mathrm{prob}(\text{tails} \mid I) = \frac{1}{2} \tag{5.78}$$


Let us elaborate on the idea of not having any reason to believe. Suppose we
had ten possibilities labeled by $X_i$, i = 1,2,...,10 and we had no reason to think
any was more likely than any other. We would then have

$$\mathrm{prob}(X_1 \mid I) = \mathrm{prob}(X_2 \mid I) = \dots = \mathrm{prob}(X_{10} \mid I) = \frac{1}{10} \tag{5.79}$$


Suppose that we relabel or reorder the possibilities. If the conditioning on I 
truly represents gross ignorance about any details of the situation, then such a 
reordering should not make any difference in the probability assignments. Any 
other statement has to mean that we have other important information besides 
the simple ordering of the possibilities. For example, imagine that you called a 
certain side of the coin heads and therefore the other side tails. Nothing changes 
if your friend switches the meaning of heads and tails. This justification of the 
Bernoulli principle led Jaynes to suggest that we think of it as a consequence of 
the requirement of consistency. 


This principle of insufficient reason can only be applied to a limited set of 
problems involving games of chance. It leads, however, to some very familiar and 
very important results if combined with the product and sum rules of probability 
theory. 

Example 1: Assume W white balls and R red balls in an urn. We now pick
the balls out of the urn randomly. The principle of indifference says that we
should assign a uniform prior probability (actually a pdf)

$$\mathrm{prob}(j \mid I) = \frac{1}{R+W}, \qquad j = 1,2,3,\dots,R+W \tag{5.80}$$

for the proposition that any particular ball, denoted by index j, will be picked.
Using the marginalization idea from earlier

$$\mathrm{prob}(X \mid I) = \int \mathrm{prob}(X \cap Y \mid I)\,dY \tag{5.81}$$


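we have

$$\begin{aligned}
\mathrm{prob}(\text{red} \mid I) &= \sum_{j=1}^{R+W} \mathrm{prob}(\text{red} \cap j \mid I) \\
&= \sum_{j=1}^{R+W} \mathrm{prob}(j \mid I)\,\mathrm{prob}(\text{red} \mid j \cap I) = \frac{1}{R+W} \sum_{j=1}^{R+W} \mathrm{prob}(\text{red} \mid j \cap I)
\end{aligned} \tag{5.82}$$

where we have used the product rule. The term $\mathrm{prob}(\text{red} \mid j \cap I)$ is one if the jth
ball is red and zero if it is white. Therefore the summation equals the number
of red balls R and we get

$$\mathrm{prob}(\text{red} \mid I) = \frac{1}{R+W} \sum_{j=1}^{R+W} \mathrm{prob}(\text{red} \mid j \cap I) = \frac{R}{R+W} \tag{5.83}$$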


as expected. However, we have derived this result from the principle of
indifference and the product rule. It also follows from the basic notion of
probability, that is,

$$\mathrm{prob}(\text{red} \mid I) = \frac{\text{number of cases favorable to red}}{\text{total number of equally possible cases}} = \frac{R}{R+W} \tag{5.84}$$


We now assume that after each pick the ball is returned to the urn and we ask 
the question: what is the probability that N such picks (trials) will result in r 
red balls? 


Using marginalization and the product rule we can write

$$\mathrm{prob}(r \mid N \cap I) = \sum_k \mathrm{prob}(r \cap S_k \mid N \cap I) = \sum_k \mathrm{prob}(r \mid S_k \cap N \cap I)\,\mathrm{prob}(S_k \mid N \cap I) \tag{5.85}$$

where the summation is over the $2^N$ possible sequences of red-white outcomes
$\{S_k\}$ of N picks. The term $\mathrm{prob}(r \mid S_k \cap N \cap I)$ equals one if $S_k$ contains exactly
r red balls and is zero otherwise, so that we need only consider those sequences
which have exactly r red outcomes for $\mathrm{prob}(S_k \mid N \cap I)$.

Now we have

$$\mathrm{prob}(S_k \mid N \cap I) = [\mathrm{prob}(\text{red} \mid I)]^r\,[\mathrm{prob}(\text{white} \mid I)]^{N-r} = \frac{R^r W^{N-r}}{(R+W)^N} \tag{5.86}$$

Hence,

$$\mathrm{prob}(r \mid N \cap I) = \frac{R^r W^{N-r}}{(R+W)^N} \sum_k \mathrm{prob}(r \mid S_k \cap N \cap I) \tag{5.87}$$

for those $S_k$ that matter, i.e., we are only considering those $S_k$ which contain
exactly r red balls. In this case we have

$$\mathrm{prob}(r \mid N \cap I) = \frac{R^r W^{N-r}}{(R+W)^N}\,\frac{N!}{r!\,(N-r)!} \tag{5.88}$$

where the last factor just corresponds to the number of sequences (permutations)
containing r red balls. Thus,

$$\mathrm{prob}(r \mid N \cap I) = \frac{N!}{r!\,(N-r)!}\,p^r (1-p)^{N-r} \tag{5.89}$$

where

$$p = \frac{R}{R+W} = \text{probability of picking a red ball} \tag{5.90}$$

and

$$q = 1 - p = \frac{W}{R+W} = \text{probability of picking a white ball} \tag{5.91}$$

Note that p + q = 1 as it should since red and white balls are the only possibilities.


We can then compute the frequency r/N with which we expect to observe red
balls. We have

$$\begin{aligned}
\left\langle \frac{r}{N} \right\rangle &= \sum_{r=0}^{N} \frac{r}{N}\,\mathrm{prob}(r \mid N \cap I) = \sum_{r=0}^{N} \frac{r}{N}\,\frac{N!}{r!\,(N-r)!}\,p^r (1-p)^{N-r} \\
&= \sum_{r=1}^{N} \frac{(N-1)!}{(r-1)!\,(N-r)!}\,p^r (1-p)^{N-r} = p \sum_{j=0}^{N-1} \frac{(N-1)!}{j!\,(N-1-j)!}\,p^j (1-p)^{N-1-j} \\
&= p\,(p+q)^{N-1} = p = \frac{R}{R+W}
\end{aligned} \tag{5.92}$$

as the expected or anticipated result. Thus, the expected frequency of red balls,
in repetitions of the urn experiment, is equal to the probability of picking one
red ball in a single trial.
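
The urn results can be verified exactly for a small case by summing over all $2^N$ red/white sequences, which mirrors the marginalization used in the derivation. This is a sketch under illustrative assumptions: the values R = 3, W = 5, N = 6 and the function name are arbitrary choices. It also evaluates the mean-square deviation of r/N about p, which for this distribution equals p(1-p)/N.

```python
from fractions import Fraction
from itertools import product
from math import comb

def prob_r_by_enumeration(R, W, N):
    """prob(r | N, I) for r = 0..N, summed over all 2^N red/white sequences."""
    p = Fraction(R, R + W)          # probability of red on a single pick
    q = 1 - p                       # probability of white on a single pick
    probs = [Fraction(0)] * (N + 1)
    for seq in product("rw", repeat=N):
        r = seq.count("r")
        probs[r] += p**r * q**(N - r)   # each sequence with r reds has this probability
    return probs

R, W, N = 3, 5, 6
p = Fraction(R, R + W)
enumerated = prob_r_by_enumeration(R, W, N)
formula = [comb(N, r) * p**r * (1 - p)**(N - r) for r in range(N + 1)]
expected_frequency = sum(Fraction(r, N) * pr for r, pr in enumerate(enumerated))
mean_square_dev = sum((Fraction(r, N) - p)**2 * pr for r, pr in enumerate(enumerated))

print("matches binomial formula:", enumerated == formula)
print("expected frequency:", expected_frequency, " p:", p)
print("mean-square deviation:", mean_square_dev, " p(1-p)/N:", p * (1 - p) / N)
```

Using exact fractions avoids any rounding, so the comparison with the binomial formula is an equality rather than an approximation.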


A similar calculation for the mean-square deviation gives the result

$$\left\langle \left( \frac{r}{N} - \left\langle \frac{r}{N} \right\rangle \right)^2 \right\rangle = \frac{pq}{N} \tag{5.93}$$

Since this becomes zero in the limit of large N, it agrees with the result we
derived earlier. It also verifies that Bernoulli's famous theorem or law of large
numbers is valid:

$$\lim_{N \to \infty} \frac{r}{N} = \mathrm{prob}(\text{red} \mid I) \tag{5.94}$$

This relationship, which allows prediction of the long-run frequency of
occurrence from the probability assignment, goes in a direction opposite to the
one we want, that is, we would like to be able to determine the probability of
obtaining a red ball, in a single pick, given a finite number of observed
outcomes. This is, in fact, exactly what Bayes' theorem allows us to do!

How do we generalize Bernoulli’s principle of insufficient reason to the case of 
continuous parameters, that is, when the quantity of interest is not restricted 
to certain discrete values (heads/tails)? 

Suppose we have a variable X which represents the position of some object. We 
then define a probability as follows. Given the information I, the probability 
that X lies in the infinitesimal range between x and x + dx is 

$$\mathrm{prob}(X = x \mid I) = \lim_{\delta x \to 0} \mathrm{prob}(x < X < x + \delta x \mid I) \tag{5.95}$$

so that we are treating continuous pdfs as the limiting case of discrete ones. 
Although it is still awkward to enumerate the possibilities in this case, we can 
still make use of the principle of consistency which underlies the principle of 
indifference. 

Examples: 

A Location Parameter 

Suppose that we are unsure about the actual location of the origin. Should
this make any difference to the pdf assigned for X? Since I represents gross
ignorance about any details of the situation other than the knowledge that X
pertains to a location, the answer must be no; otherwise we must already have
information regarding the position of the object. Consistency then demands
that the pdf for X should not change with the location of the origin or any
offset in the position values. Mathematically, we say

$$\mathrm{prob}(X \mid I)\,dX = \mathrm{prob}(X + x_0 \mid I)\,d(X + x_0) \tag{5.96}$$

Since $x_0$ is a constant, $d(X + x_0) = dX$ so that we have

$$\mathrm{prob}(X \mid I) = \mathrm{prob}(X + x_0 \mid I) = \text{constant} \tag{5.97}$$

so that the complete ignorance about a location parameter is represented by the
assignment of a uniform pdf.

A Scale Parameter 

Suppose that we have another parameter that tells us about size or magnitude, a
so-called scale parameter. If we are interested in the size L of some object and we
have no idea about the length scale involved, then the pdf should be invariant
with respect to shrinking or stretching the length scale. Mathematically, the
requirement of consistency can be written

$$\mathrm{prob}(L \mid I)\,dL = \mathrm{prob}(\beta L \mid I)\,d(\beta L) \tag{5.98}$$

where $\beta$ is a positive constant. Then since $d(\beta L) = \beta\,dL$ we must have

$$\mathrm{prob}(L \mid I) = \beta\,\mathrm{prob}(\beta L \mid I) \tag{5.99}$$

which can only be satisfied if

$$\mathrm{prob}(L \mid I) \propto \frac{1}{L} \tag{5.100}$$

which is called Jeffreys' prior. It represents complete ignorance about the value
of a scale parameter.

Now we must have

$$\mathrm{prob}(L \mid I)\,dL = \mathrm{prob}(f(L) \mid I)\,df(L) \tag{5.101}$$

since we are looking at the same domain of values in each case. We then have

$$\mathrm{prob}(\log L \mid I)\,d(\log L) = \mathrm{prob}(L \mid I)\,dL \tag{5.102}$$

$$\mathrm{prob}(\log L \mid I)\,\frac{dL}{L} = \mathrm{prob}(L \mid I)\,dL \tag{5.103}$$

$$\mathrm{prob}(\log L \mid I) = L\,\mathrm{prob}(L \mid I) = \text{constant} \tag{5.104}$$

Thus, the assignment of a uniform pdf for log L is the way to represent complete
ignorance about a scale parameter.
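
The scale-invariance property can be checked numerically. The sketch below compares the unnormalized weight that a 1/L pdf and a uniform pdf assign to an interval [a, b] and to its rescaled copy [βa, βb]. The pdfs are left unnormalized because 1/L cannot be normalized over all positive L; the helper name, interval and values of β are illustrative assumptions.

```python
import math

def interval_mass(pdf, a, b, n=100_000):
    """Midpoint-rule integral of an unnormalized pdf over [a, b]."""
    step = (b - a) / n
    return sum(pdf(a + (i + 0.5) * step) for i in range(n)) * step

def scale_prior(L):
    """Unnormalized 1/L pdf for a scale parameter."""
    return 1.0 / L

def flat_prior(L):
    """Unnormalized uniform pdf, for comparison."""
    return 1.0

a, b = 1.0, 2.0
for beta in (1.0, 10.0, 1000.0):
    scaled = interval_mass(scale_prior, beta * a, beta * b)
    flat = interval_mass(flat_prior, beta * a, beta * b)
    print(f"beta = {beta:7.1f}   1/L mass = {scaled:.6f}   uniform mass = {flat:.1f}")
```

The 1/L weight stays at log(2) for every β, while the uniform weight grows in proportion to β, so only the former is indifferent to the choice of length unit.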

5.4. Testable Information: The Principle of Maximum Entropy

Clearly, some pdfs can be assigned given only the nature of the quantities
involved (as we saw above). The methods employed hinge on the use of consistency
arguments along with transformation groups, which characterize the ignorance
for a given situation.

For a set of discrete probabilities (finite) the associated pdf must be invariant
with respect to any permutation of the propositions (permutation group). In
the continuous parameter case, the associated transformations are translation
(origin shift) and dilation (shrink/stretch), which are also group
transformations.

Let us move on to a situation where we do not have total ignorance.

Suppose that a die, with the usual six faces, was rolled a very large number of
times and we are only told that the average result was 4.5. What probability
should we assign for the various outcomes $\{X_i\}$ that the face on top had i dots?

The information or condition I provided by the experiment is written as a simple
constraint equation

$$\sum_{i=1}^{6} i\,\mathrm{prob}(X_i \mid I) = 4.5 \tag{5.105}$$

If we had assumed a uniform pdf, then we would have predicted a different
average

$$\sum_{i=1}^{6} i\,\mathrm{prob}(X_i \mid I) = \frac{1}{6} \sum_{i=1}^{6} i = 3.5 \tag{5.106}$$

which means the uniform pdf is not a valid assignment.

There are many pdfs that are consistent with the experimental results. Which 
one is the best? 

The constraint equation above is called testable information. 

With such a condition, we can either accept or reject any proposed pdf. Jaynes (one 
of the most brilliant theoretical physicists ever) proposed that, in this situation, 
we should make the assignment by using the principle of maximum entropy 
(MaxEnt), that is, we should choose that pdf which has the most entropy S 
while satisfying the available constraints. 

Explicitly, for the case of the die experiment above, we need to maximize

$$S = -\sum_{i=1}^{6} p_i \log_e(p_i) \tag{5.107}$$

where $p_i = \mathrm{prob}(X_i \mid I)$ subject to the conditions:

(1) normalization constraint

$$\sum_{i=1}^{6} p_i = 1 \tag{5.108}$$

and

(2) testable information constraint

$$\sum_{i=1}^{6} i\,p_i = 4.5 \tag{5.109}$$

Such a constrained optimization is done using the method of Lagrange
multipliers as shown below.

Define the functions

$$f(p_i) = \sum_{i=1}^{6} p_i - 1 = 0 \;\Rightarrow\; \frac{\partial f}{\partial p_j} = 1 \tag{5.110}$$

$$g(p_i) = \sum_{i=1}^{6} i\,p_i - 4.5 = 0 \;\Rightarrow\; \frac{\partial g}{\partial p_j} = j \tag{5.111}$$

The maximization problem can then be written in the following way. Instead of
maximizing S alone, we maximize the quantity

$$S + \lambda_f f + \lambda_g g = S \quad \text{(by definition)} \tag{5.112}$$

where the constants $\lambda_f$ and $\lambda_g$ are called undetermined or Lagrange multipliers.
Thus the maximization equation becomes

$$\frac{\partial S}{\partial p_j} + \lambda_f \frac{\partial f}{\partial p_j} + \lambda_g \frac{\partial g}{\partial p_j} = 0, \qquad j = 1,2,3,4,5,6 \tag{5.113}$$

We get the equations

$$-\log_e(p_j) - 1 + \lambda_f + j\lambda_g = 0, \qquad j = 1,2,3,4,5,6 \tag{5.114}$$

and we obtain

$$-\log_e(p_{j+1}) - 1 + \lambda_f + (j+1)\lambda_g = -\log_e(p_j) - 1 + \lambda_f + j\lambda_g \tag{5.115}$$

This implies that

$$\log_e\left(\frac{p_{j+1}}{p_j}\right) = \lambda_g \;\Rightarrow\; \frac{p_{j+1}}{p_j} = \beta = \text{constant} \tag{5.116}$$

This gives

$$-\log_e(p_1) - 1 + \lambda_f + \log_e\beta = 0 \;\Rightarrow\; \lambda_f = 1 + \log_e\left(\frac{p_1}{\beta}\right) \tag{5.117}$$

Therefore

$$\sum_{i=1}^{6} p_i = 1 = p_1\left(1 + \beta + \beta^2 + \beta^3 + \beta^4 + \beta^5\right) \tag{5.118}$$

$$\sum_{i=1}^{6} i\,p_i = 4.5 = p_1\left(1 + 2\beta + 3\beta^2 + 4\beta^3 + 5\beta^4 + 6\beta^5\right) \tag{5.119}$$

or dividing to get rid of $p_1$ we have

$$\frac{1 + 2\beta + 3\beta^2 + 4\beta^3 + 5\beta^4 + 6\beta^5}{1 + \beta + \beta^2 + \beta^3 + \beta^4 + \beta^5} = 4.5 \tag{5.120}$$

which gives

$$1.5\beta^5 + 0.5\beta^4 - 0.5\beta^3 - 1.5\beta^2 - 2.5\beta - 3.5 = 0 \tag{5.121}$$

Solving numerically for $\beta$ we get 1.449255 so that

$$\begin{aligned}
p_1 &= \frac{1}{1 + \beta + \beta^2 + \beta^3 + \beta^4 + \beta^5} = 0.05435 \\
p_2 &= \beta p_1 = 0.07877 \\
p_3 &= \beta p_2 = 0.11416 \\
p_4 &= \beta p_3 = 0.16545 \\
p_5 &= \beta p_4 = 0.23977 \\
p_6 &= \beta p_5 = 0.34749
\end{aligned} \tag{5.122}$$

is the MaxEnt assignment for the pdf for the outcomes of the die roll, given
only that it has the usual six faces and yields an average result of 4.5.
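
The numerical solution is easy to reproduce. The sketch below assumes only what the derivation gives, namely that successive probabilities have a constant ratio β, and finds β by bisection on the mean. The function names and the bisection bounds are illustrative choices.

```python
import math

def mean_for_beta(beta, faces=6):
    """Mean face value when p_i is proportional to beta**(i - 1)."""
    weights = [beta ** (i - 1) for i in range(1, faces + 1)]
    return sum(i * w for i, w in zip(range(1, faces + 1), weights)) / sum(weights)

def solve_beta(target_mean, lo=1e-6, hi=100.0, tol=1e-12):
    """Bisection on beta; the mean increases monotonically with beta."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_for_beta(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta = solve_beta(4.5)
norm = sum(beta ** k for k in range(6))
probs = [beta ** k / norm for k in range(6)]          # p_1 ... p_6
entropy = -sum(p * math.log(p) for p in probs)

print(f"beta = {beta:.6f}")
print("p_i  =", [round(p, 5) for p in probs])
print(f"mean = {sum((k + 1) * p for k, p in enumerate(probs)):.4f}, S = {entropy:.4f}")
```

Bisection is used here instead of a polynomial root finder because the mean is a monotonic function of β, so a single bracketed root is guaranteed for any target mean between 1 and 6.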


Why should the entropy function

$$S = -\sum_{i=1}^{6} p_i \log_e(p_i) \tag{5.123}$$

specified above be the choice for a selection criterion?


Let us look at two examples that suggest this criterion is highly desirable and 
probably correct. 

Kangaroo Problem (Gull and Skilling)

The kangaroo problem is as follows: 

Information: 1/3 of all kangaroos have blue eyes and 1/3 of all kangaroos are 
left-handed 


Question: On the basis of this information alone, what proportion of kangaroos 
are both blue-eyed and left-handed? 

For any particular kangaroo, there are four distinct possibilities, namely, that 
it is 

(1) blue-eyed and left-handed 

(2) blue-eyed and right-handed 

(3) not blue-eyed but left-handed 

(4) not blue-eyed but right-handed 

Bernoulli's law of large numbers says that the expected values of the fraction
of kangaroos with characteristics (1)-(4) will be equal to the probabilities
$(p_1, p_2, p_3, p_4)$ we assign to each of these propositions.

This is represented by a 2 x 2 truth or contingency table as shown below:

                     Left-handed True    Left-handed False
Blue-Eyed True             p_1                 p_2
Blue-Eyed False            p_3                 p_4

Table 5.3: General Truth Table


Although there are four possible combinations of eye-color and handedness to
be considered, the related probabilities are not completely independent of each
other. We have the standard normalization requirement

$$\sum_{i=1}^{4} p_i = 1 \tag{5.124}$$

In addition, we also have two conditions on the so-called marginal probabilities

$$p_1 + p_2 = \mathrm{prob}(\text{blue} \cap \text{left} \mid I) + \mathrm{prob}(\text{blue} \cap \text{right} \mid I) = 1/3 \tag{5.125}$$

$$p_1 + p_3 = \mathrm{prob}(\text{blue} \cap \text{left} \mid I) + \mathrm{prob}(\text{not-blue} \cap \text{left} \mid I) = 1/3 \tag{5.126}$$

Since any $p_i \ge 0$, these imply that $0 \le p_1 \le 1/3$. Using this result we can
characterize the contingency table by a single variable $x = p_1$ as in the table
below:



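                     Left-handed True    Left-handed False
Blue-Eyed True         0 ≤ x ≤ 1/3           1/3 - x
Blue-Eyed False          1/3 - x             1/3 + x

Table 5.4: For Kangaroo Problem

where we have used

$$x = p_1 \tag{5.127}$$

$$p_1 + p_2 = \frac{1}{3} \;\Rightarrow\; p_2 = \frac{1}{3} - x \tag{5.128}$$

$$p_1 + p_3 = \frac{1}{3} \;\Rightarrow\; p_3 = \frac{1}{3} - x \tag{5.129}$$

$$p_1 + p_2 + p_3 + p_4 = 1 \;\Rightarrow\; p_4 = \frac{1}{3} + x \tag{5.130}$$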

All such solutions, where 0 ≤ x ≤ 1/3, satisfy the constraints of the testable
information that is available. Which one is best?

Common sense leads us towards the assignment based on independence of these
two traits, since any other assignment would imply that knowing a kangaroo's
eye-color tells us something about its handedness. Since we have no information
to determine even the sign of any potential correlation, let alone its magnitude,
any choice other than independence is not justified.


The independence choice says that

$$x = p_1 = \mathrm{prob}(\text{blue} \cap \text{left} \mid I) = \mathrm{prob}(\text{blue} \mid I)\,\mathrm{prob}(\text{left} \mid I) = \frac{1}{9} \tag{5.131}$$

In this particular example it was straightforward to decide the most sensible 
pdf assignment in the face of the inadequate information. 

We now ask whether there is some function of the $\{p_i\}$ which, when maximized
subject to the known constraints, yields the independence solution. The
importance of finding an answer to this question is that it would become a good
candidate for a general variational principle that could be used in situations
that were too complicated for our common sense.


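Skilling has shown that the only function which gives x = 1/9 is the entropy S
as specified above or

$$S = -\sum_{i=1}^{4} p_i \log_e(p_i) = -x\log_e(x) - 2\left(\frac{1}{3} - x\right)\log_e\left(\frac{1}{3} - x\right) - \left(\frac{1}{3} + x\right)\log_e\left(\frac{1}{3} + x\right) \tag{5.132}$$

The results of Skilling's investigations, including three proposed alternatives,

$$S_1 = -\sum_{i=1}^{4} p_i \log_e(p_i) \;\Rightarrow\; \text{MaxEnt}, \qquad S_2 = -\sum_{i=1}^{4} p_i^2, \qquad S_3 = \sum_{i=1}^{4} \log_e(p_i), \qquad S_4 = \sum_{i=1}^{4} \sqrt{p_i} \tag{5.133}$$

are shown in the table below: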


Function    Optimal x    Implied Correlation
S1          0.1111       None
S2          0.0833       Negative
S3          0.1301       Positive
S4          0.1218       Positive

Table 5.5: Skilling Results


Clearly, only the MaxEnt assumption leads to an optimal value with no
correlations as expected.
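
The tabulated optima can be reproduced with a simple grid search over x. The sketch below takes the three alternatives to be $-\sum p_i^2$, $\sum \log_e(p_i)$ and $\sum \sqrt{p_i}$; these forms are an assumption about the intended definitions, adopted because they reproduce the values in Table 5.5. Function and dictionary names are hypothetical.

```python
import math

def cells(x):
    """Cell probabilities (p1, p2, p3, p4) of the kangaroo table for x = p1."""
    third = 1.0 / 3.0
    return (x, third - x, third - x, third + x)

# Candidate functions to maximize (S2-S4 are assumed forms, see the text above).
CANDIDATES = {
    "S1": lambda p: -sum(pi * math.log(pi) for pi in p),   # entropy
    "S2": lambda p: -sum(pi ** 2 for pi in p),
    "S3": lambda p: sum(math.log(pi) for pi in p),
    "S4": lambda p: sum(math.sqrt(pi) for pi in p),
}

def best_x(score, n_grid=100_000):
    """Grid search for the x in (0, 1/3) that maximizes score(cells(x))."""
    xs = [(i + 0.5) / (3.0 * n_grid) for i in range(n_grid)]   # strictly inside (0, 1/3)
    return max(xs, key=lambda x: score(cells(x)))

optimal = {name: best_x(score) for name, score in CANDIDATES.items()}
for name, x in optimal.items():
    if abs(x - 1.0 / 9.0) < 1e-4:
        sign = "none"
    else:
        sign = "positive" if x > 1.0 / 9.0 else "negative"
    print(f"{name}: optimal x = {x:.4f}, implied correlation: {sign}")
```

An optimal x above 1/9 means blue eyes and left-handedness occur together more often than independence would predict, and below 1/9 less often, which is how the last column of the table is read.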

Let us look at another example that lends further support to the MaxEnt
principle.

The Team of Monkeys

Suppose there are M distinct possibilities $\{X_i\}$ to be considered. How can we
assign truth values (probabilities $\mathrm{prob}(X_i \mid I) = p_i$) to these possibilities
given some testable information I (experimental results)?

What is the most honest and fair procedure? 

Imagine playing the following game. 

The various propositions are represented by different boxes all of the same size 
into which pennies are thrown at random. The tossing job is often assigned 
to a team of monkeys under the assumption that this will not introduce any 
underlying bias into the process. 

After a very large number of coins have been distributed into the boxes, the
fraction found in each of the boxes gives a possible assignment of the
probability for the corresponding $\{X_i\}$.

The resulting pdf may not be consistent with the constraints of I, of course, in 
which case it must be rejected as a potential candidate. If it is in agreement, 
then it is a viable option. 

The process is then repeated by the monkeys many times. After many such
trials, some distributions will be found to come up more often than others. The
one that occurs most frequently (and satisfies I) would be a sensible choice for
$\mathrm{prob}(\{X_i\} \mid I)$.

This is so because the team of monkeys has no axe to grind (no underlying 
bias) and thus the most frequent solution can be regarded as the one that best 
represents our state of knowledge. It agrees with all the testable information 
available while being as indifferent as possible to everything else. 

Does this correspond to the pdf with the greatest value of $S = -\sum p_i \log_e(p_i)$?

After the monkeys have tossed all the pennies given to them, suppose that we
find $n_1$ in the first box, $n_2$ in the second box, and so on. We then have

$$N = \sum_{i=1}^{M} n_i = \text{total number of coins} \tag{5.134}$$

which will be assumed to be very large and also much greater than the number
of boxes M.

This distribution gives rise to the candidate pdf $\{p_i\}$ for the possibilities $\{X_i\}$:

$$p_i = \frac{n_i}{N}, \qquad i = 1,2,\dots,M \tag{5.135}$$

Since every penny can land in any of the boxes there are $M^N$ different
ways of tossing the coins among the boxes. Each way, by assumption of
randomness and no underlying bias by the monkeys, is equally likely to occur.
Not all of the basic sequences, however, are distinct, since many yield the same
distribution $\{n_i\}$. The expected frequency F with which a set $\{p_i\}$ will arise is
given by

$$F(\{p_i\}) = \frac{\text{number of ways of obtaining } \{n_i\}}{M^N} \tag{5.136}$$

The numerator is just the number of ways to distribute N coins in a distribution
$\{n_i\}$ which is given by

$$\text{number of ways of obtaining } \{n_i\} = \frac{N!}{n_1!\,n_2! \cdots n_M!} \tag{5.137}$$

Putting everything together we have

$$F(\{p_i\}) = \frac{\text{number of ways of obtaining } \{n_i\}}{M^N} = \frac{N!}{M^N\,n_1!\,n_2! \cdots n_M!} \tag{5.138}$$

and, taking the logarithm,

$$\log(F) = -N\log(M) + \log(N!) - \sum_{i=1}^{M} \log(n_i!) \tag{5.139}$$

Using Stirling's approximation $\log(n!) \approx n\log(n) - n$ for large n, we find

$$\begin{aligned}
\log(F) &= -N\log(M) + N\log(N) - \sum_{i=1}^{M} n_i\log(n_i) - N + \sum_{i=1}^{M} n_i \\
&= -N\log(M) + N\log(N) - \sum_{i=1}^{M} n_i\log(n_i)
\end{aligned} \tag{5.140}$$

and thus

$$\begin{aligned}
\log(F) &= -N\log(M) + N\log(N) - \sum_{i=1}^{M} p_i N \log(p_i N) \\
&= -N\log(M) + N\log(N) - \sum_{i=1}^{M} p_i N \left(\log(p_i) + \log(N)\right) \\
&= -N\log(M) + N\log(N) - N\sum_{i=1}^{M} p_i\log(p_i) - N\log(N)\sum_{i=1}^{M} p_i \\
&= -N\log(M) + N\log(N) - N\sum_{i=1}^{M} p_i\log(p_i) - N\log(N) \\
&= -N\log(M) - N\sum_{i=1}^{M} p_i\log(p_i)
\end{aligned} \tag{5.141}$$

Maximizing log(F) is equivalent to maximizing F, which is the expected
frequency with which the monkeys will come up with the candidate pdf $\{p_i\}$,
that is, maximizing log(F) will give us the assignment $\mathrm{prob}(\{X_i\} \mid I)$ which best
represents our state of knowledge consistent with the testable information I.
Since M and N are constants, this is equivalent to the constrained maximization
of the entropy function

$$S = -\sum_{i=1}^{M} p_i \log_e(p_i) \tag{5.142}$$

and so we recover the MaxEnt procedure once again.
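
The quality of the Stirling step can be checked directly by comparing the exact log-frequency, computed from the multinomial count, with the entropy form. This is a small sketch; the box counts are arbitrary illustrative values and the function names are hypothetical.

```python
import math

def log_frequency_exact(counts):
    """log F = log(N! / (n_1! ... n_M!)) - N log M, using log-gamma for the factorials."""
    N, M = sum(counts), len(counts)
    log_ways = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_ways - N * math.log(M)

def log_frequency_entropy(counts):
    """Stirling form: log F is approximately -N log M + N S, with S = -sum p log p."""
    N, M = sum(counts), len(counts)
    S = -sum((n / N) * math.log(n / N) for n in counts if n > 0)
    return -N * math.log(M) + N * S

even = [3334, 3333, 3333]      # nearly uniform distribution of 10000 coins in 3 boxes
skewed = [5000, 3000, 2000]    # a less even distribution of the same coins

for counts in (even, skewed):
    exact = log_frequency_exact(counts)
    approx = log_frequency_entropy(counts)
    print(counts, f"exact = {exact:.1f}, entropy form = {approx:.1f}")
```

The two forms differ only by terms that grow like log N, which are negligible next to N S for large N, and both rank the more even (higher-entropy) distribution as overwhelmingly more frequent.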

5.5. Discussion 

In discussions of Bayesian methods, opponents often use the words subjective
probabilities to say that the methods are not as valid as normal objective
probability theory.

These opponents are misguided. 

The main point of concern centers around the choice of the prior pdf, that is, 
what should we do if it is not known? 

This is actually a very strange question. It is usually posed this way by
opponents of the Bayesian methods in an attempt to prove its subjective nature.

No probability, whether prior, likelihood or whatever, is ever known. It is simply
an assignment which reflects the relevant information that is available. Thus,
$\mathrm{prob}(x \mid I_1) \neq \mathrm{prob}(x \mid I_2)$, in general, where the conditioning statements $I_1$ and $I_2$
are different.

Nevertheless, objectivity can, and must, be introduced by demanding that two
people with the same information I should assign the same pdf. I think that
this consistency requirement is the most important idea of all.

Invariance arguments, under transformation groups, can be used to uniquely 
determine a pdf when given only the nature of the quantities involved. MaxEnt 
provides a powerful extension when we have testable constraints. 

While we may yet be far from knowing how to convert every piece of vague 
information into a concrete probability assignment, we can deal with a wide 
variety of problems with these ideas. 

The important point is that nowhere in our discussion have we explicitly
differentiated between a prior and a likelihood. We have only considered how to
assign $\mathrm{prob}(X \mid I)$ for different types of I. If X pertains to data, then we call
$\mathrm{prob}(X \mid I)$ a likelihood. If neither X nor I refers to (new) measurements, then
we may say it is a prior.

The distinction between the two cases is one of nomenclature and not of
objectivity or subjectivity. If it appears otherwise, then this is because we are
usually prepared to state conditioning assumptions for the likelihood function
but shy away from doing likewise for the prior pdf.

The use of Bayesian methods in quantum mechanics presents a very different
view of quantum probability than normally appears in quantum theory
textbooks. It is becoming increasingly important in discussions of measurement.


5.6. Problems 

5.6.1. Simple Probability Concepts 

There are 14 short problems in this section. If you have not studied any
probability ideas before using this book, then these are all new to you and doing
them should enable you to learn the basic ideas of probability methods. If you
have studied probability ideas before, these should all be straightforward.

(a) Two dice are rolled, one after the other. Let A be the event that the
second number is greater than the first. Find P(A).

(b) Three dice are rolled and their scores added. Are you more likely to get 9 
than 10, or vice versa? 

(c) Which of these two events is more likely? 

1. four rolls of a die yield at least one six 

2. twenty-four rolls of two dice yield at least one double six 

(d) From meteorological records it is known that for a certain island at its 
winter solstice, it is wet with probability 30%, windy with probability 
40% and both wet and windy with probability 20%. Find 

(1) Prob(dry)

(2) Prob(dry AND windy)

(3) Prob(wet OR windy)

(e) A kitchen contains two fire alarms; one is activated by smoke and the 
other by heat. Experiment has shown that the probability of the smoke 
alarm sounding within one minute of a fire starting is 0.95, the probability 
of the heat alarm sounding within one minute of a fire starting is 0.91, and 
the probability of both alarms sounding within one minute is 0.88. What 
is the probability of at least one alarm sounding within a minute? 

(f) Suppose you are about to roll two dice, one from each hand. What is
the probability that your right-hand die shows a larger number than your
left-hand die? Now suppose you roll the left-hand die first and it shows 5.
What is the probability that the right-hand die shows a larger number?

(g) A coin is flipped three times. Let A be the event that the first flip gives a
head and B be the event that there are exactly two heads overall. Determine

(1) P(A|B)

(2) P(B|A)

(h) A box contains a double-headed coin, a double-tailed coin and a
conventional coin. A coin is picked at random and flipped. It shows a head.
What is the probability that it is the double-headed coin?

(i) A box contains 5 red socks and 3 blue socks. If you remove 2 socks at 
random, what is the probability that you are holding a blue pair? 

(j) An inexpensive electronic toy made by Acme Gadgets Inc. is defective 
with probability 0.001. These toys are so popular that they are copied 
and sold illegally but cheaply. Pirate versions capture 10% of the market 
and any pirated copy is defective with probability 0.5. If you buy a toy, 
what is the chance that it is defective? 

(k) Patients may be treated with any one of a number of drugs, each of which 
may give rise to side effects. A certain drug C has a 99% success rate 
in the absence of side effects and side effects only arise in 5% of cases. 
However, if they do arise, then C only has a 30% success rate. If C is 
used, what is the probability of the event A that a cure is effected? 

(l) Suppose a multiple choice question has c available choices. A student 
either knows the answer with probability p or guesses at random with 
probability 1 - p. Given that the answer selected is correct, what is the 
probability that the student knew the answer? 

(m) Common PINs do not begin with zero. They have four digits. A computer 
assigns you a PIN at random. What is the probability that all four digits 
are different? 

(n) You are dealt a hand of 5 cards from a conventional deck (52 cards). A 
full house comprises 3 cards of one value and 2 of another value. If that 
hand has 4 cards of one value, this is called four of a kind. Which is more 
likely? 


5.6.2. Playing Cards 

Two cards are drawn at random from a shuffled deck and laid aside without 
being examined. Then a third card is drawn. Show that the probability that 
the third card is a spade is 1/4 just as it was for the first card. HINT: Consider 
all the (mutually exclusive) possibilities (two discarded cards spades, third card 
spade or not spade, etc). 


5.6.3. Birthdays 

What is the probability that you and a friend have different birthdays? (for 
simplicity let a year have 365 days). What is the probability that three people 
have different birthdays? Show that the probability that n people have different 
birthdays is 


$$p = \left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\left(1 - \frac{3}{365}\right)\cdots\left(1 - \frac{n-1}{365}\right)$$

Estimate this for $n \ll 365$ by calculating $\log(p)$ (use the fact that $\log(1 + x) \approx x$ 
for $x \ll 1$). Find the smallest integer N for which p < 1/2. Hence show that 
for a group of N people or more, the probability is greater than 1/2 that two of 
them have the same birthday. 
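
The product and its logarithmic estimate can be checked numerically. The following is a minimal sketch, not part of the original problem; the function names are hypothetical, and the approximation uses log(1 + x) ≈ x summed over k = 1, ..., n-1, which gives log p ≈ -n(n-1)/(2·365).

```python
import math

def prob_all_different(n, days=365):
    """Exact probability that n people all have different birthdays."""
    p = 1.0
    for k in range(1, n):
        p *= 1.0 - k / days
    return p

def prob_all_different_approx(n, days=365):
    """Estimate from log(1 + x) ~ x: log p ~ -n(n-1)/(2*days)."""
    return math.exp(-n * (n - 1) / (2.0 * days))

def smallest_group(days=365):
    """Smallest n for which the exact probability drops below 1/2."""
    n = 1
    while prob_all_different(n, days) >= 0.5:
        n += 1
    return n

if __name__ == "__main__":
    for n in (2, 3, 10, 22, 23):
        print(n, prob_all_different(n), prob_all_different_approx(n))
```

Comparing the two columns shows how good the small-x approximation remains for group sizes near the threshold.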


5.6.4. Is there life? 

The number of stars in our galaxy is about $N = 10^{11}$. Assume that the probability 
that a star has planets is $p = 10^{-2}$, the probability that the conditions on 
the planet are suitable for life is $q = 10^{-2}$, and the probability of life evolving, 
given suitable conditions, is $r = 10^{-2}$. These numbers are rather arbitrary. 

(a) What is the probability of life existing in an arbitrary solar system (a star 
and planets, if any)? 

(b) What is the probability that life exists in at least one solar system? 

5.6.5. Law of Large Numbers 

This problem illustrates the law of large numbers. 

(a) Assuming the probability of obtaining heads in a coin toss is 0.5, compare 
the probability of obtaining heads in 5 out of 10 tosses with the probability 
of obtaining heads in 50 out of 100 tosses and with the probability of 
obtaining heads in 5000 out of 10000 tosses. What is happening? 

(b) For a set of 10 tosses, a set of 100 tosses and a set of 10000 tosses, calculate 
the probability that the fraction of heads will be between 0.445 and 0.555. 
What is happening? 
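
Both parts can be evaluated exactly with binomial coefficients. The sketch below is an illustration under the stated assumption of a fair coin; the function names are hypothetical. Python integers have arbitrary precision, so the sums stay exact until the final division.

```python
from math import comb

def prob_exact_heads(k, n):
    """P(exactly k heads in n fair tosses) = C(n, k) / 2**n."""
    return comb(n, k) / 2**n

def prob_fraction_between(n, lo=0.445, hi=0.555):
    """P(lo <= k/n <= hi) for n fair tosses, summed over integer k."""
    total = sum(comb(n, k) for k in range(n + 1) if lo * n <= k <= hi * n)
    return total / 2**n

if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, prob_exact_heads(n // 2, n), prob_fraction_between(n))
```

The first column of probabilities shrinks with n while the second grows toward 1, which is the contrast the two parts of the problem are designed to expose.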


5.6.6. Bayes 

Suppose that you have 3 nickels and 4 dimes in your right pocket and 2 nickels 
and a quarter in your left pocket. You pick a pocket at random and from it 
select a coin at random. If it is a nickel, what is the probability that it came 
from your right pocket? Use Bayes' formula. 


5.6.7. Psychological Tests 

Two psychologists reported on tests in which subjects were given the prior 
information: 

I = In a certain city, 85% of the taxicabs 
are blue and 15% are green 

and the data: 


D = A witness to a crash who is 80% reliable (i.e., 
who in the lighting conditions prevailing can 
distinguish between green and blue 80% of the 
time) reports that the taxicab involved in the 
crash was green 


The subjects were then asked to judge the probability that the taxicab was 
actually blue. What is the correct answer? 


5.6.8. Bayes Rules, Gaussians and Learning 


Let us consider a classical problem (no quantum uncertainty). Suppose we are 
trying to measure the position of a particle and we assign a prior probability 
function, 

$$p(x) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\, e^{-(x-x_0)^2/2\sigma_0^2}$$

Our measuring device is not perfect. Due to noise it can only measure with a 
resolution $\Delta$, i.e., when I measure the position, I must assume error bars of this 
size. Thus, if my detector registers the position as y, I assign the likelihood that 
the position was x by a Gaussian, 

$$p(y|x) = \frac{1}{\sqrt{2\pi\Delta^2}}\, e^{-(y-x)^2/2\Delta^2}$$

Use Bayes' theorem to show that, given the new data, I must now update my 
probability assignment of the position to a new Gaussian, 

$$p(x|y) = \frac{1}{\sqrt{2\pi\sigma'^2}}\, e^{-(x-x')^2/2\sigma'^2}$$

where 

$$x' = x_0 + K_1(y - x_0) , \quad \sigma'^2 = K_2\sigma_0^2 , \quad K_1 = \frac{\sigma_0^2}{\sigma_0^2 + \Delta^2} , \quad K_2 = \frac{\Delta^2}{\sigma_0^2 + \Delta^2}$$


Comment on the behavior as the measurement resolution improves. How does 
the learning process work? 
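
To get a feel for the learning process, the update can be applied repeatedly, with each posterior serving as the next prior. This is a minimal sketch, assuming the gains K1 = σ0²/(σ0² + Δ²) and K2 = Δ²/(σ0² + Δ²) that Bayes' theorem gives for a Gaussian prior and Gaussian likelihood; the function name and all numerical values are hypothetical.

```python
def bayes_update(x0, sigma0, y, delta):
    """Return (x', sigma') for a Gaussian prior of mean x0 and width sigma0
    and a Gaussian likelihood of width delta centered on the reading y."""
    k1 = sigma0**2 / (sigma0**2 + delta**2)   # weight given to the new data
    k2 = delta**2 / (sigma0**2 + delta**2)    # factor by which the variance shrinks
    x_new = x0 + k1 * (y - x0)
    sigma_new = (k2 * sigma0**2) ** 0.5
    return x_new, sigma_new

# Repeated measurements: each posterior becomes the next prior.
x, sigma = 0.0, 2.0                 # hypothetical prior mean and width
for y in [3.1, 2.7, 3.0, 2.9]:      # hypothetical detector readings
    x, sigma = bayes_update(x, sigma, y, delta=1.0)
    print(f"estimate = {x:.3f}, width = {sigma:.3f}")
```

The width decreases with every reading, and the limit delta → 0 gives K1 → 1, so that a single perfect measurement overrides the prior completely.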


5.6.9. Berger’s Burgers - Maximum Entropy Ideas 

A fast food restaurant offers three meals: burger, chicken, and fish. The price, 
Calorie count, and probability of each meal being delivered cold are listed below 
in Table 5.6: 


Item     Entree    Cost     Calories   Prob(hot)   Prob(cold) 
Meal 1   burger    $1.00    1000       0.5         0.5 
Meal 2   chicken   $2.00    600        0.8         0.2 
Meal 3   fish      $3.00    400        0.9         0.1 

Table 5.6: Berger’s Burgers Details 


We want to identify the state of the system, i.e., the values of 

Prob(burger) = P(B ) 

Prob(chicken) = P(C ) 

Prob(fish) = P(F ) 

Even though the problem has now been set up, we do not know which state is the 
actual state of the system. To express what we do know despite this ignorance, or 
uncertainty, we assume that each of the possible states $A_i$ has some probability 
of occupancy $P(A_i)$, where i is an index running over the possible states. As 
stated above, for the restaurant model, we have three such possibilities, which 
we have labeled P(B), P(C), and P(F). 

A probability distribution $P(A_i)$ has the property that each of the probabilities 
is in the range $0 \le P(A_i) \le 1$ and since the events are mutually exclusive and 
exhaustive, the sum of all the probabilities is given by 

$$1 = \sum_i P(A_i) \tag{5.143}$$

Since probabilities are used to cope with our lack of knowledge and since one 
person may have more knowledge than another, it follows that two observers 
may, because of their different knowledge, use different probability distributions. 
In this sense probability, and all quantities that are based on probabilities, are 
subjective. 

Our uncertainty is expressed quantitatively by the information which we do not 
have about the state occupied. This information is 

$$S = \sum_i P(A_i) \log_2\left(\frac{1}{P(A_i)}\right) \tag{5.144}$$

This information is measured in bits because we are using logarithms to base 2. 
In physical systems, this uncertainty is known as the entropy. Note that the 
entropy, because it is expressed in terms of probabilities, depends on the observer. 

The principle of maximum entropy (MaxEnt) is used to discover the probabil¬ 
ity distribution which leads to the largest value of the entropy (a maximum), 
thereby assuring that no information is inadvertently assumed. 

If one of the probabilities is equal to 1, then all the other probabilities are equal 
to 0, and the entropy is equal to 0. 

It is a property of the above entropy formula that it has its maximum when 
all the probabilities are equal (for a finite number of states), which is the state of 
maximum ignorance. 

If we have no additional information about the system, then such a result seems 
reasonable. However, if we have additional information, then we should be able 
to find a probability distribution which is better in the sense that it has less 
uncertainty. 

In this problem we will impose only one constraint. The particular constraint 
is the known average price for a meal at Berger’s Burgers, namely $1.75. This 
constraint is an example of an expected value. 

(a) Express the constraint in terms of the unknown probabilities and the prices 
for the three types of meals. 

(b) Using this constraint and the total probability equal to 1 rule find possible 
ranges for the three probabilities in the form 

a < P(B) < b 
c < P(C) < d 
e < P(F) < f 

(c) Using this constraint, the total probability equal to 1 rule, the entropy 
formula and the MaxEnt rule, find the values of P(B), P(C), and P(F) 
which maximize S. 

(d) For this state determine the expected value of Calories and the expected 
number of meals served cold. 

In finding the state which maximizes the entropy, we found the probability 
distribution that is consistent with the constraints and has the largest uncertainty. 
Thus, we have not inadvertently introduced any biases into the probability 
estimation. 


5.6.10. Extended Menu at Berger’s Burgers 

Suppose now that Berger’s extends its menu to include a Tofu option as shown 
in Table 5.7 below: 


Entree    Cost     Calories   Prob(hot)   Prob(cold) 
burger    $1.00    1000       0.5         0.5 
chicken   $2.00    600        0.8         0.2 
fish      $3.00    400        0.9         0.1 
tofu      $8.00    200        0.6         0.4 

Table 5.7: Extended Berger’s Burgers Menu Details 


Suppose you are now told that the average meal price is $2.50. 

Use the method of Lagrange multipliers to determine the state of the system 
(i.e., P(B), P(C), P(F) and P(T)). 

You will need to solve some equations numerically. 
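
One possible way to organize the numerical step is sketched below. It assumes the standard outcome of the Lagrange-multiplier calculation with a single average-cost constraint, namely that each probability is proportional to exp(-β × cost), and finds the multiplier β by bisection. The function names and the search bracket are hypothetical choices, not part of the problem statement.

```python
import math

def maxent_distribution(costs, mean_cost, lo=-20.0, hi=20.0, tol=1e-12):
    """Maximum-entropy probabilities subject to sum(P) = 1 and
    sum(P * cost) = mean_cost, assuming P_i proportional to exp(-beta * cost_i)."""
    def average(beta):
        weights = [math.exp(-beta * c) for c in costs]
        return sum(w * c for w, c in zip(weights, costs)) / sum(weights)

    # average(beta) decreases monotonically from max(costs) to min(costs),
    # so bisection on beta converges whenever mean_cost lies in between.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if average(mid) > mean_cost:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    weights = [math.exp(-beta * c) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]

def entropy_bits(probs):
    """Entropy in bits, S = sum p log2(1/p)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

if __name__ == "__main__":
    probs = maxent_distribution([1.0, 2.0, 3.0, 8.0], 2.50)
    print(probs, entropy_bits(probs))
```

The same routine applies to the three-meal menu by passing the costs [1.0, 2.0, 3.0] and the average price 1.75, which gives a check against the analytic solution of the previous problem.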


5.6.11. The Poisson Probability Distribution 

The arrival time of rain drops on the roof or photons from a laser beam on a 
detector are completely random, with no correlation from count to count. If 
we count for a certain time interval we won’t always get the same number - it 
will fluctuate from shot-to-shot. This kind of noise is sometimes known as shot 
noise or counting statistics. 


Suppose the particles arrive at an average rate R. In a small time interval 
$\Delta t \ll 1/R$ no more than one particle can arrive. We seek the probability for n 
particles to arrive after a time t, P(n,t). 

(a) Show that the probability to detect zero particles exponentially decays, 
$P(0,t) = e^{-Rt}$. 

(b) Obtain a differential equation as a recursion relation 

$$\frac{d}{dt}P(n,t) + RP(n,t) = RP(n-1,t)$$


(c) Solve this to find the Poisson distribution 


$$P(n,t) = \frac{(Rt)^n e^{-Rt}}{n!}$$


Plot a histogram for Rt = 0.1,1.0,10.0. 

(d) Show that the mean and standard deviation in number of counts are: 

$$\langle n\rangle = Rt , \quad \sigma_n = \sqrt{Rt} = \sqrt{\langle n\rangle}$$

[HINT: To find the variance consider $\langle n(n-1)\rangle$]. 

Fluctuations going as the square root of the mean are characteristic of 
counting statistics. 

(e) An alternative way to derive the Poisson distribution is to note that the 
count in each small time interval is a Bernoulli trial (find out what this is), 
with probability $p = R\Delta t$ to detect a particle and 1 - p for no detection. 
The total number of counts is thus the binomial distribution. We need to 
take the limit as $\Delta t \to 0$ (thus $p \to 0$) but Rt remains finite (this is just 
calculus). To do this let the total number of intervals $N = t/\Delta t \to \infty$ while 
Np = Rt remains finite. Take this limit to get the Poisson distribution. 
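
The distribution and its binomial limit can be explored numerically before attempting the derivations. The following is a rough sketch with hypothetical function names; it uses a text histogram rather than a plotting library so that it runs with the standard library alone.

```python
import math

def poisson(n, mean):
    """P(n, t) = (Rt)^n exp(-Rt) / n!, written in terms of mean = Rt."""
    return mean**n * math.exp(-mean) / math.factorial(n)

def binomial(n, trials, p):
    """Probability of n detections in `trials` Bernoulli intervals."""
    return math.comb(trials, n) * p**n * (1.0 - p) ** (trials - n)

def text_histogram(mean, n_max=20, width=50):
    """Print one row of '#' marks per count n, scaled by its probability."""
    for n in range(n_max + 1):
        print(f"{n:2d} {'#' * round(width * poisson(n, mean))}")

if __name__ == "__main__":
    for mean in (0.1, 1.0, 10.0):
        print(f"Rt = {mean}")
        text_histogram(mean)
    # Binomial with N intervals and p = Rt/N approaches the Poisson value.
    for trials in (10, 1000, 100000):
        print(trials, binomial(2, trials, 1.0 / trials), poisson(2, 1.0))
```

Increasing the number of intervals while holding Np fixed shows the binomial values settling onto the Poisson ones, which is the limit described in part (e).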


5.6.12. Modeling Dice: Observables and Expectation Values 

Suppose we have a pair of six-sided dice. If we roll them, we get a pair of results 

$$a \in \{1,2,3,4,5,6\} , \quad b \in \{1,2,3,4,5,6\}$$

where a is an observable corresponding to the number of spots on the top face 
of the first die and b is an observable corresponding to the number of spots on 
the top face of the second die. If the dice are fair, then the probabilities for the 
roll are 


$$Pr(a = 1) = Pr(a = 2) = Pr(a = 3) = Pr(a = 4) = Pr(a = 5) = Pr(a = 6) = 1/6$$
$$Pr(b = 1) = Pr(b = 2) = Pr(b = 3) = Pr(b = 4) = Pr(b = 5) = Pr(b = 6) = 1/6$$

Thus, the expectation values of a and b are 

$$\langle a\rangle = \sum_{i=1}^{6} i\,Pr(a = i) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 7/2$$

$$\langle b\rangle = \sum_{i=1}^{6} i\,Pr(b = i) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 7/2$$

Let us define two new observables in terms of a and b: 

$$s = a + b , \quad p = ab$$

Note that the possible values of s range from 2 to 12 and the possible values of 
p range from 1 to 36. Perform an explicit computation of the expectation values 
of s and p by writing out 

$$\langle s\rangle = \sum_{i=2}^{12} i\,Pr(s = i)$$

and 

$$\langle p\rangle = \sum_{i=1}^{36} i\,Pr(p = i)$$

Do this by explicitly computing all the probabilities Pr(s = i) and Pr(p = i). 
You should find that $\langle s\rangle = \langle a\rangle + \langle b\rangle$ and $\langle p\rangle = \langle a\rangle\langle b\rangle$. Why are these results not 
surprising? 
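
A hand computation of the 11 values of Pr(s = i) and the many values of Pr(p = i) is easy to get wrong, so a brute-force enumeration is a useful cross-check. This is a small sketch under the assumption of two fair, independent dice; the helper names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
rolls = list(product(faces, faces))      # 36 equally likely (a, b) pairs
weight = Fraction(1, len(rolls))

def distribution(observable):
    """Map each value of observable(a, b) to its probability."""
    dist = {}
    for a, b in rolls:
        value = observable(a, b)
        dist[value] = dist.get(value, 0) + weight
    return dist

def expectation(dist):
    """Sum of value times probability over the distribution."""
    return sum(value * prob for value, prob in dist.items())

s_dist = distribution(lambda a, b: a + b)
p_dist = distribution(lambda a, b: a * b)
print(expectation(s_dist), expectation(p_dist))
```

The dictionaries `s_dist` and `p_dist` hold the full probability tables, so individual entries such as Pr(s = 8) can be compared with a hand calculation.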

5.6.13. Conditional Probabilities for Dice 

Use the results of Problem 5.6.12. You should be able to intuit the correct answers 
for these problems by straightforward probabilistic reasoning; if not, you can use 
Bayes' Rule 

$$Pr(x|y) = \frac{Pr(y|x)\,Pr(x)}{Pr(y)}$$

to calculate the results. Here Pr(x|y) represents the probability of x given y, 
where x and y should be propositions of equations (for example, Pr(a = 2|s = 8) 
is the probability that a = 2 given that s = 8). 

(a) Suppose your friend rolls a pair of dice and, without showing you the result, 
tells you that s = 8. What is your conditional probability distribution for 
a? 

(b) Suppose your friend rolls a pair of dice and, without showing you the 
result, tells you that p = 12. What is your conditional expectation value 
for s? 

5.6.14. Matrix Observables for Classical Probability 

Suppose we have a biased coin, which has probability ph of landing heads-up 
and probability pt of landing tails-up. Say we flip the biased coin but do not 
look at the result. Just for fun, let us represent this preparation procedure by 
a classical state vector 



(a) Define an observable (random variable) r that takes value +1 if the coin 
is heads-up and -1 if the coin is tails-up. Find a matrix R such that 

$$x_0^T R x_0 = \langle r\rangle$$

where $\langle r\rangle$ denotes the mean, or expectation value, of our observable. 

(b) Now find a matrix F such that the dynamics corresponding to turning the 
coin over (after having flipped it, but still without looking at the result) 
is represented by 

$$x_0 \mapsto F x_0$$

and 

$$\langle r\rangle \mapsto x_0^T F^T R F x_0$$

Does $U = F^T R F$ make sense as an observable? If so, explain what values 
it takes for a coin-flip result of heads or tails. What about $RF$ or $F^T R$? 

(c) Let us now define the algebra of flipped-coin observables to be the set V 
of all matrices of the form 

$$v = aR + bR^2 , \quad a, b \in \mathbb{R}$$

Show that this set is closed under matrix multiplication and that it is 
commutative. In other words, for any $v_1, v_2 \in V$, show that 

$$v_1 v_2 \in V , \quad v_1 v_2 = v_2 v_1$$

Is U in this set? How should we interpret the observable represented by 
an arbitrary element $v \in V$? 


Chapter 6 

The Formulation of Quantum Mechanics 


6.1. Introduction 

A physical theory is 

1. a set of basic physical concepts 

2. a mathematical formalism 

3. a set of connections between the physical concepts and the mathematical 
objects that represent them 

The process of doing theoretical physics involves: 

1. constructing a mathematical formalism that allows us to express a physical 
problem or experiment in terms of mathematical objects 

2. solving the mathematical system representing the physical problem - this 
is pure mathematics at this point 

3. translating the mathematical solution to the physical world using the set 
of connections 

A description of a physical system will involve three ingredients: 

1. variables or measurable quantities that characterize the system 

2. states that describe values of the variables as a function of time 

3. equations of motion that determine how the state variables change in 
time 

In classical mechanics, the position of a particle, which is a physical concept, 
is connected to a set of real numbers, which is a mathematical object. This 
does not, in general, present us with any conceptual difficulties because we are 
familiar with both positions and numbers from our everyday experiences. 

In quantum mechanics, however, the mathematical formalism that we will be 
discussing is very abstract. In addition, we lack any intuition based on 
experience concerning quantum phenomena. The everyday world that we live in does 
not appear to consist of Hermitian operators and infinite-dimensional vectors. 

Throughout the last century, many experiments have shown that the various 
dynamical variables, which seem to have a continuous spectrum in classical 
mechanics, generally have completely discrete or quantized spectra or, at least, 
a spectrum that is both discrete and continuous. In quantum mechanics, this 
property will lead to most of the so-called non-classical results and force us to 
work in a world of probabilities. 

We will take an approach to the theoretical formalism of quantum mechanics 
presented in terms of postulates. 


6.2. Two Postulates: 

Presentation, Rationalization and Meaning 


Postulate 1 

For each dynamical variable or observable, which is a physical 
concept, there corresponds a Hermitian, linear operator, 
which is a mathematical object. 

The possible values of any measurement of the observable are 
restricted to the eigenvalues of the corresponding operator. 

It is the nature of a postulate that we can only make ourselves feel good about 
it or, in other words, make ourselves feel that it makes sense in some way. A 
priori, we cannot justify it in any way and we certainly cannot prove it is true or 
we would have assumed a more fundamental postulate (the one we used to prove 
the truth). A posteriori, we justify postulates by their effect on our predictions 
about experimental results. My philosophy of theoretical physics is based on 
this statement: 

If the predictions agree with experiment, 
on certain quantum systems, then the 
postulates are valid for that class of systems. 

So what can we say about the 1st postulate? We already know that linear 
operators possess both discrete and continuous spectra. In addition, we have a 
vast storehouse of mathematical knowledge available about linear operators. So 
making this choice certainly is a sensible (and the simplest) place to start. 

Note that the postulate does not give us any rules for assigning operators to 
observables. 

Singling out the eigenvalues is also sensible since they represent a special 
connection to the properties of the associated linear operator. Choosing Hermitian 
operators also makes sense since this guarantees that we have only real 
eigenvalues representing measured quantities. 

Now we need to deal with states. The mathematical object that we connect to 
a physical state must allow us to calculate the probability distributions for all 
observables. 

Before stating the 2nd postulate, we need to present some new mathematical 
objects and ideas. 

$$\mathrm{Tr}\,W = 1 = \sum_k W_{kk} = \sum_k \langle\phi_k|W|\phi_k\rangle \tag{6.1}$$

where $W_{kk}$ is the diagonal matrix element (in the basis $\{|\phi_k\rangle\}$) of the density operator 
W. 

Some Properties of the Trace: 

Tr(AB) = Tr(BA) (6.2) 

Tr(cB) = cTr(B) (6.3) 

Tr(c(A + B)) = Tr(cA) + Tr(cB) = cTr(A) + cTr(B) (6.4) 

If we denote the eigenvalues of W by $w_k$ and the corresponding eigenvectors by 
$|w_k\rangle$ so that 

$$W|w_k\rangle = w_k|w_k\rangle \tag{6.5}$$

then, since W has a pure point spectrum, we can write W in terms of its 
eigenvalues and eigenvectors (spectral representation) as 

$$W = \sum_k w_k|w_k\rangle\langle w_k| \tag{6.6}$$

Since W is self-adjoint, its eigenvectors form an orthonormal basis where 

$$\langle w_k|w_j\rangle = \delta_{kj} \tag{6.7}$$

We now derive some properties of this density operator mathematical object. 

The spectrum of W is the set of numbers $\{w_k\}$. We then have 

$$\begin{aligned}
\mathrm{Tr}\,W = 1 &= \sum_j \langle w_j|W|w_j\rangle \\
&= \sum_j \langle w_j|w_j|w_j\rangle = \sum_j w_j\langle w_j|w_j\rangle = \sum_j w_j
\end{aligned} \tag{6.8}$$

So, 

$$\sum_j w_j = 1 \tag{6.9}$$

W is a bounded, Hermitian (self-adjoint) operator. Since W is self-adjoint, this 
means that $W = W^\dagger$, which implies that the eigenvalues are real, i.e., $w_k = w_k^*$. 
Using the fact that W is defined to be a positive operator, we then have 

$$\begin{aligned}
\langle a|W|a\rangle &= \langle a|\left(\sum_k w_k|w_k\rangle\langle w_k|\right)|a\rangle = \sum_k w_k\langle a|w_k\rangle\langle w_k|a\rangle \\
&= \sum_k w_k|\langle a|w_k\rangle|^2 \ge 0
\end{aligned} \tag{6.10}$$

for any vector $|a\rangle$. 

This can only be true, in general, if 

$$w_k \ge 0 \text{ for all } k \tag{6.11}$$

The results 

$$w_k \ge 0 \text{ for all } k \quad\text{and}\quad \sum_j w_j = 1 \tag{6.12}$$

imply that 

$$0 \le w_k \le 1 \tag{6.13}$$

Finally, for any other operator B 

$$WB = \sum_k w_k|w_k\rangle\langle w_k|B \tag{6.14}$$


We then have 

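$$\begin{aligned}
\mathrm{Tr}(WB) &= \sum_j \langle w_j|\sum_k w_k|w_k\rangle\langle w_k|B|w_j\rangle \\
&= \sum_k w_k\sum_j \langle w_j|w_k\rangle\langle w_k|B|w_j\rangle = \sum_k w_k\sum_j \delta_{jk}\langle w_k|B|w_j\rangle \\
&= \sum_k w_k\langle w_k|B|w_k\rangle
\end{aligned} \tag{6.15}$$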

which is a weighted sum (where the eigenvalues of the density operator are the 
weight factors) of the diagonal matrix elements of the operator B in a basis 
consisting of eigenvectors of W. 
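
The weighted-sum result is easy to confirm on a small example. The sketch below builds a 2 × 2 density matrix from hypothetical eigenvalues (0.25 and 0.75) and a hypothetical orthonormal pair of eigenvectors, then compares Tr(WB) with the weighted sum of diagonal matrix elements for an arbitrary Hermitian B. All names and numbers are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def outer(v):
    """Projector |v><v| for a normalized vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def sandwich(v, b):
    """Diagonal matrix element <v|B|v>."""
    n = len(v)
    return sum(v[i].conjugate() * b[i][j] * v[j]
               for i in range(n) for j in range(n))

r = 2 ** -0.5
w1, w2 = [r, r], [r, -r]                      # orthonormal eigenvectors of W
weights = [0.25, 0.75]                        # eigenvalues of W, summing to 1
W = [[weights[0] * p + weights[1] * q for p, q in zip(row1, row2)]
     for row1, row2 in zip(outer(w1), outer(w2))]
B = [[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -3.0]]   # an arbitrary Hermitian matrix

lhs = trace(matmul(W, B))
rhs = weights[0] * sandwich(w1, B) + weights[1] * sandwich(w2, B)
print(lhs, rhs)
```

The same helpers also let one check the cyclic property Tr(AB) = Tr(BA) by comparing `trace(matmul(W, B))` with `trace(matmul(B, W))`.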

Before we can show how this connects to probabilities, we must discuss the 
concept of a state and its representation as a mathematical object. 

In classical mechanics, a state refers to the set of coordinates and momenta of 
the particles in the system. 

In quantum mechanics, we say that a state describes the values of a set of 
measurable quantities. The state description must be probabilistic so that we 
can only know the probabilities that these values might occur. 

In essence, we assume (this is the standard approach to quantum mechanics) 
that the quantum state description refers to an ensemble of similarly prepared 
systems (so that we can make probability statements) and NOT to any individual 
system in the ensemble. We note that there is some dispute about this point. 

We are, thus, identifying the state directly with a set of probability distributions. 

Our earlier discussion (Chapter 5) of the experimental limit frequency and its 
connection to probability indicated that if we know the average (or mean) value 
or the expectation value, which has the symbol $\langle\ldots\rangle$, for all of the observables 
in our physical system, we have then specified the state of the system as exactly 
as is possible. 

Some properties that we require of expectation values in order that they make 
physical sense as average values are: 

1. if B is self-adjoint, then $\langle B\rangle$ is real 

2. if B is positive, then $\langle B\rangle$ is nonnegative 

3. if c = complex number, then $\langle cB\rangle = c\langle B\rangle$ 

4. $\langle A + B\rangle = \langle A\rangle + \langle B\rangle$ 

5. $\langle I\rangle = 1$, where I is the identity operator 

Postulate 2 

(a) A density operator exists for every real physical system. 

This rather innocuous looking postulate is at the heart of quantum theory. Its 
full meaning will only become clear as we learn its connection to the probability 
interpretation of quantum mechanics. 

(b) The expectation value of an operator B is given by 

$$\langle B\rangle = \mathrm{Tr}(WB) \tag{6.16}$$

If we assume that every bounded, Hermitian operator represents some measurable 
quantity for a physical system, then each state is represented by a unique 
density operator W. 

Let us choose a simple example of a density operator to get some handle on what 
this postulate is saying. In particular, let us choose as our density operator W 
the projection operator onto a 1-dimensional subspace spanned by the vector 
$|a\rangle$, namely 

$$W = |a\rangle\langle a| \tag{6.17}$$

This is an idempotent operator since $W^2 = W$ and thus it has eigenvalues 
$w_k = 0, 1$ only. The eigenvector corresponding to eigenvalue 1 is $|a\rangle$. We also 
have 

$$\sum_k w_k = 1 = \mathrm{Tr}(W) \tag{6.18}$$

$$\langle a|W|a\rangle = |\langle a|a\rangle|^2 \ge 0 \tag{6.19}$$

so that all required properties for being a density operator are satisfied. 

Since, 

$$0 \le w_k \le 1 \quad\text{and}\quad \sum_k w_k = 1 \tag{6.20}$$

we can think of 

$$W = \sum_k w_k|w_k\rangle\langle w_k| \tag{6.21}$$

as representing a statistical mixture of the states represented by the vectors $|w_k\rangle$ 
where the probability is $w_k$ that we have the state $|w_k\rangle$ present. 

In this simple case, we then have 

$$\langle B\rangle = \mathrm{Tr}(WB) = \langle a|B|a\rangle \tag{6.22}$$

which is the expectation value of B in the state $|a\rangle$. 

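Proof: Let $|w_1\rangle = |\beta\rangle$, $|w_2\rangle = |a\rangle$, $\langle a|\beta\rangle = 0$. Then 

$$\begin{aligned}
\mathrm{Tr}(WB) &= \sum_{m=1}^{2}\langle w_m|WB|w_m\rangle = \sum_{m=1}^{2}\langle w_m|a\rangle\langle a|B|w_m\rangle \\
&= \sum_{m=1}^{2}\langle w_m|w_2\rangle\langle w_2|B|w_m\rangle = \sum_{m=1}^{2}\delta_{m2}\langle w_2|B|w_m\rangle \\
&= \langle w_2|B|w_2\rangle = \langle a|B|a\rangle
\end{aligned} \tag{6.23}$$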

Since the important quantities for connection to experiment will be expectation 
values, we see that the state represented by W, in this case, is equally well 
represented by the state vector |a); the density operator and the state vector 
are equivalent ways of representing a physical system in this simple case. 

In general, when 

$$W = \sum_k w_k|w_k\rangle\langle w_k| \tag{6.24}$$

we saw earlier that 

$$\mathrm{Tr}(WB) = \sum_k w_k\langle w_k|B|w_k\rangle \tag{6.25}$$

Postulate #2 then says that the average or expectation value of B, $\langle B\rangle$, equals 
the weighted sum of the expectation value in each of the eigenstates $|w_k\rangle$ of W 
with a weighting factor equal to the corresponding eigenvalue $w_k$. 

Thus, if we know the numbers $\langle w_k|B|w_k\rangle$ we can find the expectation value of 
the corresponding operator, which is the maximum amount of information we 
can know. 


6.3. States and Probabilities 

6.3.1. States 

Among the set of all possible states there exists a special group called pure 
states. The density operator for a pure state is given by 

$$W = |\psi\rangle\langle\psi| \tag{6.26}$$

where the vector $|\psi\rangle$ is of unit norm and is called the state vector. We discussed 
this density operator above and found that the expectation value for an 
observable represented by Q, now in this pure state $|\psi\rangle$, is given by 

$$\langle Q\rangle = \mathrm{Tr}(WQ) = \mathrm{Tr}(|\psi\rangle\langle\psi|Q) = \langle\psi|Q|\psi\rangle \tag{6.27}$$

The state vector is defined only up to an overall phase, since $|\psi\rangle$ and $e^{i\theta}|\psi\rangle$ 
give the same expectation values. Note that the density operator is independent 
of this arbitrary phase. 

As we saw earlier the density operator $W = |\psi\rangle\langle\psi|$ for the pure state is idempotent 
and thus the only possible eigenvalues are $w_k = 0, 1$ and therefore, the form 
chosen for W agrees with the expansion in eigenvectors and eigenvalues. 

Another property of a pure state goes as follows: we have, in the general case, 

$$0 \le w_k \le 1 \quad\text{and}\quad \sum_k w_k = 1 \tag{6.28}$$

which implies that 

$$w_k^2 \le w_k \tag{6.29}$$

so that 

$$\mathrm{Tr}(W^2) = \sum_k w_k^2 \le \sum_k w_k = 1 \tag{6.30}$$

or 

$$\mathrm{Tr}(W^2) \le 1 \tag{6.31}$$

in this case. For a pure state, however, where $w_k^2 = w_k$ because W is an idempotent 
projection operator, we then have 

$$\mathrm{Tr}(W^2) = 1 \tag{6.32}$$
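
The quantity Tr(W²) therefore gives a simple numerical test for purity. As an illustration only, the sketch below evaluates it for two hypothetical 2 × 2 density matrices: a projector onto a single vector and a matrix with eigenvalues 1/2, 1/2.

```python
def purity(w):
    """Tr(W^2) for a square matrix stored as nested lists."""
    n = len(w)
    return sum(w[i][k] * w[k][i] for i in range(n) for k in range(n))

# Projector onto (|0> + |1>)/sqrt(2): a pure state.
pure = [[0.5, 0.5], [0.5, 0.5]]
# Eigenvalues 1/2 and 1/2: a nonpure state.
nonpure = [[0.5, 0.0], [0.0, 0.5]]

print(purity(pure), purity(nonpure))
```

Both matrices have unit trace, but only the first saturates the bound Tr(W²) ≤ 1.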

The most important way of distinguishing whether a state is pure or not follows 
from this property of density operators: 



The density operator for a pure state cannot be 
written as a linear combination of the density 
operators of other states, but the density operator 
for a nonpure state can always be so written. 


We can see that this is true by assuming the opposite and getting a contradiction. 
So let the density operator for a pure state be written as a linear combination 
of density operators for other states 


$$W = \sum_j c_j W^{(j)} , \quad 0 \le c_j \le 1 , \quad \sum_j c_j = 1 \tag{6.33}$$

Now 

$$\mathrm{Tr}(W^2) = \sum_i\sum_j c_i c_j\, \mathrm{Tr}(W^{(i)}W^{(j)}) \tag{6.34}$$

Now each density operator has a spectral representation of the form 


$$W^{(j)} = \sum_n w_n^{(j)}|w_n^{(j)}\rangle\langle w_n^{(j)}| \tag{6.35}$$

Substitution gives 



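$$\mathrm{Tr}(W^{(i)}W^{(j)}) = \sum_n\sum_m w_n^{(i)}w_m^{(j)}\, \mathrm{Tr}\left(|w_n^{(i)}\rangle\langle w_n^{(i)}|w_m^{(j)}\rangle\langle w_m^{(j)}|\right) \tag{6.36}$$

$$= \sum_n\sum_m w_n^{(i)}w_m^{(j)}\, |\langle w_n^{(i)}|w_m^{(j)}\rangle|^2 \le \sum_n w_n^{(i)}\sum_m w_m^{(j)} = 1 \tag{6.37}$$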

This then gives 



$$\mathrm{Tr}(W^2) = \sum_i\sum_j c_i c_j\, \mathrm{Tr}(W^{(i)}W^{(j)}) \le \sum_i c_i\sum_j c_j = 1 \tag{6.38}$$

But we assumed that W represents a pure state and the equality must hold. 
The only way for this to be true is as follows: 

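$$|\langle w_n^{(i)}|w_m^{(j)}\rangle| = 1 \text{ for all } m, n \text{ such that } w_n^{(i)}w_m^{(j)} \ne 0 \tag{6.39}$$

The Schwarz inequality $|\langle a|b\rangle|^2 \le \langle a|a\rangle\langle b|b\rangle = 1$ then says that $|w_n^{(i)}\rangle$ and 
$|w_m^{(j)}\rangle$ can differ at most by a phase factor. But by assumption, the eigenvectors 
for each separate density operator are orthogonal. The only conclusion then is 
that they are all the same operator, i.e., 

$$W^{(i)} = W^{(j)} \text{ for all } i \text{ and } j \tag{6.40}$$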

This contradicts the assumption that we could write a pure state density operator 
as a linear combination of other density operators. In fact all the density 
operators must be the same. 

Are pure states more fundamental than nonpure states? 

Can we regard nonpure states as some kind of mixture of pure states? 

The answer depends on whether we require uniqueness. It turns out that we 
cannot write nonpure states uniquely in terms of pure states. For example let 
us consider the density operator given by 



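$$W(a) = a|u\rangle\langle u| + (1 - a)|v\rangle\langle v| \tag{6.41}$$

with 0 < a < 1 and 

$$\langle u|u\rangle = \langle v|v\rangle = 1 \quad\text{and}\quad \langle u|v\rangle = \langle v|u\rangle = 0 \tag{6.42}$$

Now define two other vectors $|x\rangle$ and $|y\rangle$ as linear combinations of 
$|u\rangle$ and $|v\rangle$: 

$$|x\rangle = \sqrt{a}\,|u\rangle + \sqrt{1 - a}\,|v\rangle \tag{6.43}$$

$$|y\rangle = \sqrt{a}\,|u\rangle - \sqrt{1 - a}\,|v\rangle \tag{6.44}$$

Since 

$$\langle u|u\rangle = \langle v|v\rangle = 1 \quad\text{and}\quad \langle u|v\rangle = \langle v|u\rangle = 0 \tag{6.45}$$

these vectors are still normalized, 

$$\langle x|x\rangle = \langle y|y\rangle = 1 \tag{6.46}$$

although they are not orthogonal in general, since $\langle x|y\rangle = \langle y|x\rangle = 2a - 1$, which 
vanishes only for a = 1/2. We then have 

$$W(a) = \frac{1}{2}|x\rangle\langle x| + \frac{1}{2}|y\rangle\langle y| \tag{6.47}$$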


so the linear combination is not unique. In fact, there are an infinite number of 
ways to do it. 
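
The equality of the two decompositions can be confirmed numerically for any particular value of a. This is a minimal sketch with real two-component vectors; the choice a = 0.3, the standard basis for |u⟩ and |v⟩, and the helper names are all illustrative assumptions.

```python
import math

def projector(v):
    """Projector |v><v| for a real vector v."""
    return [[vi * vj for vj in v] for vi in v]

def combine(c1, m1, c2, m2):
    """Linear combination c1*m1 + c2*m2 of two matrices."""
    return [[c1 * p + c2 * q for p, q in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

a = 0.3                            # any value with 0 < a < 1
u, v = [1.0, 0.0], [0.0, 1.0]      # orthonormal pair
x = [math.sqrt(a), math.sqrt(1 - a)]
y = [math.sqrt(a), -math.sqrt(1 - a)]

W_uv = combine(a, projector(u), 1 - a, projector(v))
W_xy = combine(0.5, projector(x), 0.5, projector(y))
overlap = x[0] * y[0] + x[1] * y[1]     # <x|y>

print(W_uv, W_xy, overlap)
```

The two matrices agree entry by entry, while the overlap of the second pair of vectors equals 2a - 1 rather than zero, so the second decomposition uses non-orthogonal pure states.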


We will assume that both pure and nonpure states are fundamental and we will 
see physical systems later on that are described by each type. Due to the lack 
of uniqueness, we will not call the nonpure state a mixture or mixed state but 
just stick with the name nonpure. 

6.3.2. Probabilities 

Postulate #2 says that the average value of an observable represented by Q in 
the state corresponding to density operator W is given by 

$$\langle Q\rangle = \mathrm{Tr}(WQ) \tag{6.48}$$

Consider a function F(Q) of the operator Q. Let 

h(q) dq = probability that the measured value of the observable 
represented by Q lies between q and q + dq 

Thus, 

$$h(q) = \text{probability density function} \tag{6.49}$$

Now, the general definition of an average value says that 

$$\langle F(Q)\rangle = \int_{-\infty}^{\infty} F(q')h(q')\,dq' \tag{6.50}$$

Postulate #2, however, says 

$$\langle F(Q)\rangle = \mathrm{Tr}(WF(Q)) \tag{6.51}$$

Taken together, these two relations will allow us to extract the probability 
density function h(q). 

Case 1 : Discrete Spectrum 


Let Q be a self-adjoint operator representing the observable Q. Assume that it 
has a pure discrete spectrum with eigenvectors $|q_n\rangle$ and corresponding eigenvalues 
$q_n$ such that 

$$Q|q_n\rangle = q_n|q_n\rangle \tag{6.52}$$

We can write its spectral representation (express the operator in terms of its 
eigenvectors and eigenvalues) as 

$$Q = \sum_n q_n|q_n\rangle\langle q_n| \tag{6.53}$$

Now consider the function 

$$F(Q) = \theta(q - Q) \tag{6.54}$$

where 

$$\theta(q - Q) = \begin{cases} 1 & q > Q \\ 0 & q < Q \end{cases} \tag{6.55}$$


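The expectation value of this function is then 

$$\begin{aligned}
\langle\theta(q - Q)\rangle &= \int_{-\infty}^{\infty} F(q')h(q')\,dq' = \int_{-\infty}^{\infty} \theta(q - q')h(q')\,dq' \\
&= \int_{-\infty}^{q} h(q')\,dq' = \mathrm{Prob}(Q < q|W) \\
&= \text{probability that } Q < q \text{ given } W
\end{aligned} \tag{6.56}$$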
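
Alternatively, we can also write 

$$\begin{aligned}
\langle\theta(q - Q)\rangle &= \mathrm{Tr}(W\theta(q - Q)) = \mathrm{Tr}\left[W\sum_n \theta(q - q_n)|q_n\rangle\langle q_n|\right] \\
&= \sum_m \langle q_m|\left(W\sum_n \theta(q - q_n)|q_n\rangle\langle q_n|\right)|q_m\rangle \\
&= \sum_m\sum_n \langle q_m|W|q_n\rangle\theta(q - q_n)\langle q_n|q_m\rangle \\
&= \sum_m\sum_n \langle q_m|W|q_n\rangle\theta(q - q_n)\delta_{nm} = \sum_n \langle q_n|W|q_n\rangle\theta(q - q_n) \\
&= \mathrm{Prob}(Q < q|W)
\end{aligned} \tag{6.57}$$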


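Now we have 

$$\begin{aligned}
h(q) &= \frac{\partial}{\partial q}\int_{-\infty}^{q} h(q')\,dq' = \frac{\partial}{\partial q}\mathrm{Prob}(Q < q|W) \\
&= \frac{\partial}{\partial q}\sum_n \langle q_n|W|q_n\rangle\theta(q - q_n) = \sum_n \langle q_n|W|q_n\rangle\delta(q - q_n)
\end{aligned} \tag{6.58}$$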


This implies that the probability density h(q) = 0 unless q = an eigenvalue. In 
other words, the probability = 0 that a dynamical variable or observable will 
take on a value other than an eigenvalue of the corresponding operator. This 
makes it clear that postulate #2 is consistent with the last part of postulate #1, 
where we assumed this to be the case. 


Now 

Prob(Q = q\W) = probability that the observable 
represented by Q will have the discrete value 
q in the ensemble characterized by W 


We calculate this as follows: 

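$$\begin{aligned}
\mathrm{Prob}(Q = q|W) &= \lim_{\epsilon\to 0}\left(\mathrm{Prob}(Q < q + \epsilon|W) - \mathrm{Prob}(Q < q - \epsilon|W)\right) \\
&= \lim_{\epsilon\to 0}\left(\sum_n \langle q_n|W|q_n\rangle\theta(q + \epsilon - q_n) - \sum_n \langle q_n|W|q_n\rangle\theta(q - \epsilon - q_n)\right) \\
&= \sum_n \langle q_n|W|q_n\rangle\lim_{\epsilon\to 0}\left(\theta(q + \epsilon - q_n) - \theta(q - \epsilon - q_n)\right) \\
&= \sum_n \langle q_n|W|q_n\rangle\lim_{\epsilon\to 0}\left(\delta(q - q_n)\,2\epsilon\right) = \sum_n \langle q_n|W|q_n\rangle\delta_{q,q_n}
\end{aligned} \tag{6.59}$$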


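If we now define the projection operator

$$\hat{P}(q) = \sum_n |q_n\rangle\langle q_n|\,\delta_{q,q_n} \tag{6.60}$$

which projects onto the subspace spanned by all the eigenvectors (degenerate)
with eigenvalue $q = q_n$, we then have the general result that

$$\begin{aligned}
\text{Tr}\big(\hat{W}\hat{P}(q)\big) &= \sum_m \langle q_m|\,\hat{W}\hat{P}(q)\,|q_m\rangle \\
&= \sum_m \langle q_m|\,\hat{W}\Big(\sum_n |q_n\rangle\langle q_n|\,\delta_{q,q_n}\Big)|q_m\rangle \\
&= \sum_{m,n} \langle q_m|\hat{W}|q_n\rangle\,\delta_{q,q_n}\,\langle q_n|q_m\rangle \\
&= \sum_{m,n} \langle q_m|\hat{W}|q_n\rangle\,\delta_{q,q_n}\,\delta_{nm} \\
&= \sum_n \langle q_n|\hat{W}|q_n\rangle\,\delta_{q,q_n} = \text{Prob}(Q = q\,|\,W)
\end{aligned} \tag{6.61}$$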


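If we have a pure state, then

$$\hat{W} = |\psi\rangle\langle\psi| \tag{6.62}$$

and if $q_n$ is a non-degenerate eigenvalue,

$$\begin{aligned}
\text{Prob}(Q = q_n\,|\,W) &= \text{Tr}\big(\hat{W}\hat{P}(q_n)\big) = \text{Tr}\big((|\psi\rangle\langle\psi|)\,\hat{P}(q_n)\big) \\
&= \sum_m \langle q_m|\,(|\psi\rangle\langle\psi|)\,\hat{P}(q_n)\,|q_m\rangle \\
&= \sum_m \langle q_m|\,(|\psi\rangle\langle\psi|)(|q_n\rangle\langle q_n|)\,|q_m\rangle \\
&= \sum_m \langle q_m|\psi\rangle\langle\psi|q_n\rangle\langle q_n|q_m\rangle \\
&= \sum_m \langle q_m|\psi\rangle\langle\psi|q_n\rangle\,\delta_{nm} \\
&= \langle q_n|\psi\rangle\langle\psi|q_n\rangle = |\langle q_n|\psi\rangle|^2
\end{aligned} \tag{6.63}$$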

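We note the following alternate and useful form of this result. We have

$$\begin{aligned}
\text{Prob}(Q = q_n\,|\,W) &= |\langle q_n|\psi\rangle|^2 = \langle q_n|\psi\rangle\langle\psi|q_n\rangle \\
&= \langle q_n|\hat{W}|q_n\rangle = \sum_m \langle q_m|\,\hat{P}(q_n)\hat{W}\,|q_m\rangle \\
&= \text{Tr}\big(\hat{P}(q_n)\hat{W}\big)
\end{aligned} \tag{6.64}$$

Thus, for a physical system that is characterized by a density operator
$\hat{W} = |\psi\rangle\langle\psi|$ or represented by a pure state characterized by a state
vector $|\psi\rangle$, the probability of measuring a particular eigenvalue $q_n$ of an
observable $Q$ represented by the operator $\hat{Q}$ is given by

$$|\langle q_n|\psi\rangle|^2 \tag{6.65}$$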
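
As a small numerical illustration (not part of the original derivation), the sketch below checks that the trace formula and the squared amplitude agree for a two-level system. The state vector, the choice of working directly in the eigenbasis of the observable, and the helper names `matmul`, `trace` and `outer` are all assumptions made for this example.

```python
# Pure-Python sketch: Prob(Q = q_n | W) = Tr(W P(q_n)) for a 2x2 example.

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))

def outer(ket):
    """Return the matrix |ket><ket|."""
    return [[x * y.conjugate() for y in ket] for x in ket]

# Assumed pure state |psi> = 0.6|q_0> + 0.8i|q_1>, written in the eigenbasis of Q.
psi = [complex(0.6, 0.0), complex(0.0, 0.8)]
W = outer(psi)                        # density operator W = |psi><psi|

# Projectors onto the two non-degenerate eigenvectors of Q.
P0 = outer([1 + 0j, 0j])
P1 = outer([0j, 1 + 0j])

prob0 = trace(matmul(W, P0)).real     # Tr(W P(q_0))
prob1 = trace(matmul(W, P1)).real     # Tr(W P(q_1))

born0 = abs(psi[0]) ** 2              # |<q_0|psi>|^2
born1 = abs(psi[1]) ** 2              # |<q_1|psi>|^2

print(prob0, born0)
print(prob1, born1)
```

Both routes give the same two numbers, and the probabilities sum to one because the assumed state is normalized.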

To see that this all makes sense relative to the standard definition of an average
value we consider the quantity

$$\langle\alpha|\hat{Q}|\alpha\rangle = \text{average value in the state } |\alpha\rangle \tag{6.66}$$

Now using the eigenvector/eigenvalue relation

$$\hat{Q}\,|q_k\rangle = q_k\,|q_k\rangle \tag{6.67}$$

which says that we can write $\hat{Q}$ as

$$\hat{Q} = \sum_k q_k\,|q_k\rangle\langle q_k| \tag{6.68}$$

we get

$$\langle\alpha|\hat{Q}|\alpha\rangle = \sum_k q_k\,\langle\alpha|q_k\rangle\langle q_k|\alpha\rangle = \sum_k q_k\,|\langle q_k|\alpha\rangle|^2 \tag{6.69}$$

which says that the average value in the state $|\alpha\rangle$ is equal to the sum over all
possible values (the eigenvalues) of the value (eigenvalue) times the probability
of measuring that eigenvalue when in the state $|\alpha\rangle$, which is clearly given
by

$$|\langle q_k|\alpha\rangle|^2 \tag{6.70}$$

This corresponds to the standard definition of the average value.

Thus, the connection between the mathematical object $|\langle q_k|\alpha\rangle|^2$ and the
probability of measuring the eigenvalue $q_k$ when in the state $|\alpha\rangle$ is now clear and
derivable from the earlier postulates.

We do not need to assume this as another postulate as is done in 
many texts. 
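
The statement that the average value is the probability-weighted sum of eigenvalues can be checked on a small example. The sketch below assumes a specific observable (the 2x2 matrix with ones off the diagonal, whose eigenvalues are +1 and -1) and a specific real state; both are illustrative choices, as are the helper names.

```python
import math

def inner(bra, ket):
    """<bra|ket>, with the bra components complex-conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def apply(op, ket):
    """Apply a matrix (nested lists) to a ket (list)."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

# Assumed observable and its known eigenvalues/eigenvectors.
Q = [[0j, 1 + 0j], [1 + 0j, 0j]]
s = 1 / math.sqrt(2)
eigen = [(+1.0, [s + 0j, s + 0j]), (-1.0, [s + 0j, -s + 0j])]

alpha = [0.6 + 0j, 0.8 + 0j]          # assumed normalized state |alpha>

# Left-hand side of (6.69): <alpha|Q|alpha>.
direct = inner(alpha, apply(Q, alpha)).real

# Right-hand side of (6.69): sum over eigenvalues times |<q_k|alpha>|^2.
probs = [abs(inner(vec, alpha)) ** 2 for _, vec in eigen]
weighted = sum(q * p for (q, _), p in zip(eigen, probs))

print(direct, weighted, probs)
```

The two ways of computing the average agree, and the probabilities add up to one.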

A Special Case: In general, any observable will exhibit a nonzero statistical
dispersion in its measured value for most states. For the case of a discrete
spectrum, however, it is possible for all of the probability to reside in a single value.

Suppose the state involved is an eigenvector. The observable represented by the
operator $\hat{Q}$ takes on a unique value, say $q_0$ (a non-degenerate eigenvalue), with
probability = 1 in this state.

This means that

$$\text{Prob}(Q = q_0\,|\,W) = \sum_n \langle q_n|\hat{W}|q_n\rangle\,\delta_{q_0,q_n} = \langle q_0|\hat{W}|q_0\rangle = 1 \tag{6.71}$$

Now, any $\hat{W}$ satisfies

$$\text{Tr}(\hat{W}^2) \le 1 \tag{6.72}$$

or

$$\begin{aligned}
\text{Tr}(\hat{W}^2) &= \sum_n \langle q_n|\hat{W}^2|q_n\rangle = \sum_n \langle q_n|\hat{W}\hat{I}\hat{W}|q_n\rangle \\
&= \sum_n\sum_m \langle q_n|\hat{W}|q_m\rangle\langle q_m|\hat{W}|q_n\rangle \\
&= \sum_n\sum_m |\langle q_n|\hat{W}|q_m\rangle|^2 \\
&= |\langle q_0|\hat{W}|q_0\rangle|^2 + \text{rest of terms} = 1 + \text{rest of terms} \le 1
\end{aligned} \tag{6.73}$$

Since the rest of the terms are all non-negative, they must each be zero, which says that

$$\langle q_n|\hat{W}|q_m\rangle = \delta_{n0}\,\delta_{m0} \tag{6.74}$$

or all the other diagonal and non-diagonal matrix elements of $\hat{W}$ must vanish.

Thus, the only state for which the observable takes on the value $q_0$
(non-degenerate) with probability = 1 is the pure state represented by the density
operator $\hat{W} = |q_0\rangle\langle q_0|$.

This state, whether described by the density operator $\hat{W}$ or the state vector $|q_0\rangle$,
is called an eigenstate of $\hat{Q}$. The observable has a definite value (probability =
1) in an eigenstate and ONLY in an eigenstate.
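
A quick numerical comparison may make the special case concrete. The three density matrices below are assumed examples written in the eigenbasis of the observable; the sketch computes the purity Tr(W^2) and the probability of the value q_0 for each.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def purity(W):
    """Tr(W^2), which is at most 1 for any density matrix."""
    return trace(matmul(W, W)).real

def prob_q0(W):
    """<q_0|W|q_0>, the probability of obtaining the value q_0."""
    return W[0][0].real

# Assumed examples (illustrative only).
W_eigenstate = [[1 + 0j, 0j], [0j, 0j]]                       # |q_0><q_0|
W_mixture = [[0.9 + 0j, 0j], [0j, 0.1 + 0j]]                  # statistical mixture
W_superposition = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]  # pure, not an eigenstate

for name, W in (("eigenstate", W_eigenstate),
                ("mixture", W_mixture),
                ("superposition", W_superposition)):
    print(name, purity(W), prob_q0(W))
```

Only the eigenstate gives probability 1 for q_0. The superposition is pure (purity 1) but still has a spread of outcomes, and the mixture has purity below 1.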

Case 2: Continuous Spectrum 

Now we let $\hat{Q}$ be a self-adjoint operator with a purely continuous spectrum, so
that we can write

$$\hat{Q} = \int q'\,|q'\rangle\langle q'|\,dq' \quad\text{where}\quad \hat{Q}\,|q\rangle = q\,|q\rangle \tag{6.75}$$

The eigenvectors are normalized using the Dirac delta-function, as we showed
earlier

$$\langle q'|q''\rangle = \delta(q'-q'') \tag{6.76}$$


and we let 

h(q) dq = probability that the measured value of the observable 
represented by Q lies between q and q + dq 

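Then, as in the discrete case, we have

$$\langle\theta(q-\hat{Q})\rangle = \int_{-\infty}^{q} h(q')\,dq' = \text{Prob}(Q<q\,|\,W) \tag{6.77}$$

which gives the probability that $Q < q$. We also have

$$\langle\theta(q-\hat{Q})\rangle = \text{Tr}\big(\hat{W}\,\theta(q-\hat{Q})\big) \tag{6.78}$$

and using the standard expansion of an operator

$$\theta(q-\hat{Q}) = \int_{-\infty}^{\infty} \theta(q-q')\,|q'\rangle\langle q'|\,dq' \tag{6.79}$$

we get

$$\begin{aligned}
\langle\theta(q-\hat{Q})\rangle &= \text{Tr}\Big(\hat{W}\int_{-\infty}^{\infty} \theta(q-q')\,|q'\rangle\langle q'|\,dq'\Big) \\
&= \int_{-\infty}^{\infty} dq''\,\langle q''|\,\hat{W}\Big(\int_{-\infty}^{\infty} \theta(q-q')\,|q'\rangle\langle q'|q''\rangle\,dq'\Big) \\
&= \int_{-\infty}^{\infty} dq''\,\langle q''|\,\hat{W}\int_{-\infty}^{\infty} \theta(q-q')\,|q'\rangle\,\delta(q'-q'')\,dq' \\
&= \int_{-\infty}^{\infty} \theta(q-q')\,\langle q'|\hat{W}|q'\rangle\,dq' \\
&= \int_{-\infty}^{q} \langle q'|\hat{W}|q'\rangle\,dq' = \text{Prob}(Q<q\,|\,W)
\end{aligned} \tag{6.80}$$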

This says that the probability density for the observable represented by $\hat{Q}$ within
the ensemble characterized by $\hat{W}$ is

$$h(q) = \frac{\partial}{\partial q}\,\text{Prob}(Q<q\,|\,W) = \langle q|\hat{W}|q\rangle \tag{6.81}$$

in general. For the special case of a pure state where $\hat{W} = |\psi\rangle\langle\psi|$ we have

$$h(q) = |\langle q|\psi\rangle|^2 \tag{6.82}$$

Later on we shall find that this is equivalent to a probability statement for the
famous Schrödinger equation wave function.
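
To see the continuous-spectrum result at work, the sketch below takes an assumed Gaussian amplitude for the pure state, forms the density h(q) as its squared modulus, and integrates it numerically to get Prob(Q < q). The Gaussian, the integration limits and the function names are illustrative assumptions; the closed form used for comparison is the standard error-function integral of a Gaussian.

```python
import math

def psi(q):
    """Assumed normalized Gaussian amplitude <q|psi> (illustrative choice)."""
    return math.pi ** (-0.25) * math.exp(-q * q / 2.0)

def h(q):
    """Probability density h(q) = |<q|psi>|^2, as in (6.82)."""
    return abs(psi(q)) ** 2

def prob_less_than(q, lower=-10.0, steps=20000):
    """Trapezoid estimate of Prob(Q < q); 'lower' stands in for minus infinity."""
    width = (q - lower) / steps
    total = 0.5 * (h(lower) + h(q))
    for k in range(1, steps):
        total += h(lower + k * width)
    return total * width

numeric = prob_less_than(0.7)
exact = 0.5 * (1.0 + math.erf(0.7))   # closed form for this Gaussian density
total_probability = prob_less_than(10.0)

print(numeric, exact, total_probability)
```

Differentiating the cumulative probability with respect to q recovers h(q), which is the content of (6.81).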

We now turn to the topic of quantum dynamics. 

As we have seen : 


States or density operators describe values or probabilities 
for measurable quantities at given times. 

We want to find : 


Equations of motion that determine how these values 
or probabilities of measurable quantities change with time. 

It is clear from our discussion that, in the theory of quantum mechanics, a state
only needs to specify how to calculate expectation values of measurable
quantities. Therefore, we will assume that

Equations of motion, at a minimum, only need to
specify how expectation values change in time.

6.4. Quantum Pictures


There are three ways that are commonly used in quantum mechanics to make
expectation values depend on time:

Schrödinger Picture

1. states are represented by ket vectors that depend on time, $|\psi(t)\rangle$

2. operators representing observables or measurable quantities are independent
of time, $\hat{Q}$

We then get a time-dependent expectation value of the form

$$\langle\hat{Q}(t)\rangle = \langle\psi(t)|\hat{Q}|\psi(t)\rangle \tag{6.83}$$

Heisenberg Picture 


1. operators representing observables or measurable quantities are dependent
on time, $\hat{Q}(t)$

2. states are represented by ket vectors that do not depend on time, $|\psi\rangle$

We then get a time-dependent expectation value of the form

$$\langle\hat{Q}(t)\rangle = \langle\psi|\hat{Q}(t)|\psi\rangle \tag{6.84}$$


Interaction Picture 

It is a mixture of the Schrodinger and Heisenberg pictures that is appropriate 
for a very important class of problems. We will discuss it later after presenting 
the Schrodinger and Heisenberg pictures. 

All of these pictures must agree in the sense that they must all give the same
time dependence for $\langle\hat{Q}(t)\rangle$. There is, after all, a unique real world out there!!!

We will discuss these pictures in terms of state vectors first, mixing in state 
operator aspects as we proceed. 

We have been discussing the so-called formal structure of quantum mechanics. 
This structure is a fundamental part of the theory behind quantum mechanics, 
but it has very little physical content on its own. 

We cannot solve any physical problem with the formalism as it stands. We
must first develop connections or correspondence rules that tell us the specific
operators that actually represent particular dynamical variables.

The fundamental dynamical variables that we will be working with are position,
linear momentum, angular momentum and energy. All such quantities
are related to space-time symmetry transformations. As we proceed with our
discussions, we will introduce any needed aspects of symmetry transformations
and discuss further aspects of the subject later.


6.5. Transformations of States and Observables 
The way it must be. 

Experimental evidence leads us to believe that the laws of nature are invariant 
under certain space-time symmetry transformations, including displacements in 
space and time, rotations and Lorentz boosts (relative velocity changes). 

For each such symmetry, both the state vectors and the observables must have
transformations, i.e.,

$$|\psi\rangle \to |\psi'\rangle \quad,\quad \hat{Q} \to \hat{Q}' \tag{6.85}$$

We will only use pure states for most of our discussions in this development
since nothing new appears in more complex cases.

What must be preserved in these transformations?

1. If $\hat{Q}\,|q_n\rangle = q_n\,|q_n\rangle$, then $\hat{Q}'\,|q_n'\rangle = q_n\,|q_n'\rangle$

Here we are assuming that the eigenvalues are unchanged since $\hat{Q}$ and $\hat{Q}'$
are the SAME observable represented in two frames of reference.
Mathematically, operators representing the same dynamical variable must have
the same spectrum. For example, the position operator will
have the identical continuous spectrum in the range $(-\infty, \infty)$ in each frame.

2. If $|\psi\rangle = \sum_n c_n\,|q_n\rangle$ where $\hat{Q}\,|q_n\rangle = q_n\,|q_n\rangle$, then $|\psi'\rangle = \sum_n c_n'\,|q_n'\rangle$ where
$\hat{Q}'\,|q_n'\rangle = q_n\,|q_n'\rangle$. This actually follows from (1).

Now, equivalent events observed in each frame must have the same probability.
If this were not true, then some event might occur more often in one frame than
it does in another frame, which makes no physical sense.

This means that probabilities are equal or that

$$|c_n|^2 = |c_n'|^2 \quad\text{or}\quad |\langle q_n|\psi\rangle|^2 = |\langle q_n'|\psi'\rangle|^2 \tag{6.86}$$

We now present the mathematical formalism that characterizes this type of 
transformation. But first, a digression to cover two new mathematical topics. 


6.5.1. Antiunitary/Antilinear Operators 

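If, for any vectors $|\psi\rangle$ and $|\phi\rangle$, an operator $\hat{T}$ satisfies

$$\hat{T}\,(|\psi\rangle + |\phi\rangle) = \hat{T}\,|\psi\rangle + \hat{T}\,|\phi\rangle \quad\text{and}\quad \hat{T}c\,|\psi\rangle = c^*\,\hat{T}\,|\psi\rangle \tag{6.87}$$

then this type of operator is called an antilinear operator. For an antilinear
operator $\hat{T}$ to have an inverse $\hat{T}^{-1}$ such that

$$\hat{T}^{-1}\hat{T} = \hat{I} = \hat{T}\hat{T}^{-1} \tag{6.88}$$

it is necessary and sufficient that for each vector $|\psi\rangle$ there is one and only one
vector $|\phi\rangle$ such that $|\psi\rangle = \hat{T}\,|\phi\rangle$. This implies that $\hat{T}^{-1}$ is unique and antilinear.

If an antilinear operator $\hat{T}$ has an inverse $\hat{T}^{-1}$ and if $\|\hat{T}\,|\psi\rangle\| = \|\,|\psi\rangle\|$ (preserves
norm) for all $|\psi\rangle$, then $\hat{T}$ is antiunitary.

Assume $\hat{T}$ is antiunitary. Therefore, if

$$|\phi'\rangle = \hat{T}\,|\phi\rangle \quad\text{and}\quad |\psi'\rangle = \hat{T}\,|\psi\rangle \tag{6.89}$$

then we have $\langle\phi'|\psi'\rangle = \langle\phi|\psi\rangle^*$. Now if $\hat{T}$ is antilinear, then $\hat{T}^2$ is a linear operator
and if $\hat{T}$ is antiunitary, then $\hat{T}^2$ is a unitary operator.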
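
The simplest concrete antiunitary operator is complex conjugation of the components in a fixed basis. The sketch below uses that operator, called `K` here as an assumed name, to check the defining properties on arbitrary test vectors; the vectors and the scalar are made-up values.

```python
def K(ket):
    """Complex conjugation of the components: an example of an antiunitary operator."""
    return [c.conjugate() for c in ket]

def inner(bra, ket):
    """<bra|ket>, with the bra components complex-conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def scale(c, ket):
    return [c * x for x in ket]

# Arbitrary test data (illustrative only).
phi = [1 + 2j, 0.5 - 1j]
psi = [2 - 1j, -1 + 3j]
c = 0.3 + 0.7j

# Antilinearity: K(c|psi>) = c* K|psi>.
lhs = K(scale(c, psi))
rhs = scale(c.conjugate(), K(psi))

# Antiunitarity: <K phi|K psi> = <phi|psi>*.
overlap = inner(K(phi), K(psi))
expected = inner(phi, psi).conjugate()

# Applying K twice gives a linear (here the identity) operator.
twice = K(K(psi))

print(lhs, rhs)
print(overlap, expected)
print(twice == psi)
```

For this particular choice the square of the operator is the identity, which is a special case of the statement that the square of an antiunitary operator is unitary.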


6.5.2. Wigner’s Theorem 

Any mapping of the vector space onto itself

$$|\psi\rangle \to |\psi'\rangle = \hat{U}\,|\psi\rangle \quad\text{and}\quad |\phi\rangle \to |\phi'\rangle = \hat{U}\,|\phi\rangle \tag{6.90}$$

that preserves $|\langle\phi|\psi\rangle|$ can be implemented by an operator $\hat{U}$ that is unitary
(linear) when $\langle\phi'|\psi'\rangle = \langle\phi|\psi\rangle$ or antiunitary (antilinear) when $\langle\phi'|\psi'\rangle = \langle\phi|\psi\rangle^*$.

We can show that all such transformation operators of interest in quantum
mechanics are linear operators. For example, let $\hat{U}(\ell)$ describe a displacement
through a distance $\ell$. We know from experiment that this can be done as a series
of two displacements of size $\ell/2$ and thus, we must have $\hat{U}(\ell) = \hat{U}(\ell/2)\hat{U}(\ell/2)$.
Now the product of two antilinear operators is linear. Therefore, regardless of
whether $\hat{U}(\ell/2)$ is linear or antilinear, $\hat{U}(\ell)$ is linear. There is nothing special,
however, about the value $\ell$, i.e., we must also have $\hat{U}(\ell/2) = \hat{U}(\ell/4)\hat{U}(\ell/4)$, which
implies that $\hat{U}(\ell/2)$ is linear and so on.

Operators in quantum mechanics are continuous, which means that they cannot
change discontinuously from linear to antilinear as a function of $\ell$. This means
that we need only consider continuous linear transformations in quantum
mechanics.

Antilinear operators will appear later when we discuss discrete symmetries.

Now, if the transformation rule for state vectors is $|q_n'\rangle = \hat{U}\,|q_n\rangle$ where

$$\hat{Q}\,|q_n\rangle = q_n\,|q_n\rangle \quad\text{and}\quad \hat{Q}'\,|q_n'\rangle = q_n\,|q_n'\rangle \tag{6.91}$$

then we must have

$$\hat{Q}'\hat{U}\,|q_n\rangle = q_n\,\hat{U}\,|q_n\rangle \quad\text{or}\quad \hat{U}^{-1}\hat{Q}'\hat{U}\,|q_n\rangle = q_n\,|q_n\rangle \tag{6.92}$$

Therefore,

$$(\hat{Q} - \hat{U}^{-1}\hat{Q}'\hat{U})\,|q_n\rangle = (q_n - q_n)\,|q_n\rangle = 0 \tag{6.93}$$

for all $|q_n\rangle$. Since the set $\{|q_n\rangle\}$ is complete (the eigenvectors of a Hermitian
operator), this result holds for any vector $|\psi\rangle$, which can be constructed from
the set $\{|q_n\rangle\}$. Therefore, we must have

$$\hat{Q} - \hat{U}^{-1}\hat{Q}'\hat{U} = 0 \tag{6.94}$$

or

$$\hat{Q} \to \hat{Q}' = \hat{U}\hat{Q}\hat{U}^{-1} \tag{6.95}$$

is the corresponding transformation rule for linear operators.
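
As a hedged numerical check of this transformation rule, the sketch below takes an assumed diagonal observable and an assumed unitary (a real 2x2 rotation), builds the transformed operator as U Q U^{-1}, and verifies that the transformed eigenvectors belong to the unchanged eigenvalues. All numerical values and helper names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate; equals the inverse for a unitary matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(op, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

# Assumed unitary transformation and observable.
theta = 0.4
c, s = math.cos(theta), math.sin(theta)
U = [[c + 0j, -s + 0j], [s + 0j, c + 0j]]
Q = [[1 + 0j, 0j], [0j, 3 + 0j]]          # eigenvalues 1 and 3

Q_prime = matmul(U, matmul(Q, dagger(U)))  # Q' = U Q U^{-1}

residuals = []
for q_n, ket in ((1.0, [1 + 0j, 0j]), (3.0, [0j, 1 + 0j])):
    ket_prime = apply(U, ket)              # |q_n'> = U|q_n>
    image = apply(Q_prime, ket_prime)      # Q'|q_n'>
    # Size of Q'|q_n'> - q_n|q_n'>, which should vanish.
    residuals.append(max(abs(x - q_n * y) for x, y in zip(image, ket_prime)))

print(residuals)
```

The residuals vanish to rounding error, so the spectrum is the same in both frames, as required.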


6.5.3. The Transformation Operator and its Generator 

Let $t$ be a continuous parameter. We consider a family of unitary operators
$\hat{U}(t)$, with the properties

$$\hat{U}(0) = \hat{I} \quad\text{and}\quad \hat{U}(t_1 + t_2) = \hat{U}(t_1)\hat{U}(t_2) \tag{6.96}$$

Transformations such as displacements, rotations and Lorentz boosts clearly
satisfy these properties and so it makes sense to require them in general.


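Now we consider infinitesimal $t$. We can then write the infinitesimal version of
the unitary transformation as

$$\hat{U}(t) = \hat{I} + \left.\frac{d\hat{U}}{dt}\right|_{t=0} t + O(t^2) \tag{6.97}$$

Now all unitary operators must satisfy the unitarity condition

$$\hat{U}\hat{U}^\dagger = \hat{I} \quad\text{for all } t \tag{6.98}$$

Therefore, we have to first order in $t$ (an infinitesimal)

$$\begin{aligned}
\hat{U}\hat{U}^\dagger = \hat{I} &= \left[\hat{I} + \left.\frac{d\hat{U}(t)}{dt}\right|_{t=0} t + \dots\right]\left[\hat{I} + \left.\frac{d\hat{U}^\dagger(t)}{dt}\right|_{t=0} t + \dots\right] \\
&= \hat{I} + \left[\frac{d\hat{U}(t)}{dt} + \frac{d\hat{U}^\dagger(t)}{dt}\right]_{t=0} t + \dots
\end{aligned} \tag{6.99}$$

which implies that

$$\left[\frac{d\hat{U}(t)}{dt} + \frac{d\hat{U}^\dagger(t)}{dt}\right]_{t=0} = 0 \tag{6.100}$$

If we let

$$\left.\frac{d\hat{U}(t)}{dt}\right|_{t=0} = -i\hat{H} \tag{6.101}$$

then the condition (6.100) becomes

$$-i\hat{H} + (-i\hat{H})^\dagger = 0 \quad\text{or}\quad \hat{H} = \hat{H}^\dagger \tag{6.102}$$

which says that $\hat{H}$ is a Hermitian operator. It is called the generator of the
family of transformations $\hat{U}(t)$ because it determines these operators uniquely.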


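Now consider the property $\hat{U}(t_1 + t_2) = \hat{U}(t_1)\hat{U}(t_2)$. Taking the appropriate
partial derivative we have

$$\left.\frac{\partial}{\partial t_1}\hat{U}(t_1+t_2)\right|_{t_1=0} = \left.\frac{d\hat{U}(t)}{dt}\right|_{t=t_2} = \left(\left.\frac{d\hat{U}(t_1)}{dt_1}\right|_{t_1=0}\right)\hat{U}(t_2) \tag{6.103}$$

or

$$\left.\frac{d\hat{U}(t)}{dt}\right|_{t=t_2} = -i\hat{H}\hat{U}(t_2) \tag{6.104}$$

which can be written for arbitrary $t$ as

$$i\,\frac{d\hat{U}(t)}{dt} = \hat{H}\hat{U}(t) \tag{6.105}$$

If $\hat{H}$ is not explicitly dependent on time, then this equation is satisfied by the
unique solution

$$\hat{U}(t) = e^{-i\hat{H}t} \tag{6.106}$$

Thus, the generator $\hat{H}$ of the infinitesimal transformation determines the
operator $\hat{U}(t) = e^{-i\hat{H}t}$ for a finite transformation.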


This is just Stone’s theorem, which we discussed earlier, but now derived in an 
alternative way. 
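
The relation between a Hermitian generator and the family of unitary operators it produces can be checked numerically. In the sketch below the 2x2 Hermitian matrix `H` is an assumed example, and `expm` is a simple truncated Taylor series written for this illustration (adequate for small matrices of modest norm, not a general-purpose routine).

```python
def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Truncated Taylor series for exp(a)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]   # a^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

H = [[1 + 0j, 0.5 + 0j], [0.5 + 0j, -1 + 0j]]   # assumed Hermitian generator

def U(t):
    """U(t) = exp(-i H t)."""
    return expm([[-1j * t * x for x in row] for row in H])

# Group property U(t1 + t2) = U(t1) U(t2).
group_error = max_diff(U(0.7 + 1.1), matmul(U(0.7), U(1.1)))

# Unitarity U U^dagger = I.
unitarity_error = max_diff(matmul(U(0.7), dagger(U(0.7))), identity(2))

# Derivative at t = 0 should be -iH (finite-difference estimate).
step = 1e-6
derivative = [[(u - i) / step for u, i in zip(ru, ri)]
              for ru, ri in zip(U(step), identity(2))]
generator_error = max_diff(derivative, [[-1j * x for x in row] for row in H])

print(group_error, unitarity_error, generator_error)
```

The first two errors are at rounding level; the third is limited by the finite-difference step rather than by the theory.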


We will now approach the study of various pictures using simple methods and 
then repeat the process using symmetry operations, which will clearly show the 
power of using the latter approach and give us a deeper understanding about 
what is happening. 


6.6. The Schrodinger Picture 

The Schrödinger picture follows directly from the previous discussion of the
$\hat{U}(t)$ operator. Suppose we have some physical system that is represented by
the state vector $|\psi(0)\rangle$ at time $t = 0$ and represented by the state vector $|\psi(t)\rangle$
at time $t$.

We ask this question. How are these state vectors related to each other?

We make the following assumptions:

1. Every vector $|\psi(0)\rangle$ such that $\langle\psi(0)|\psi(0)\rangle = \|\,|\psi(0)\rangle\| = 1$ represents a
possible state at time $t = 0$.

2. Every vector $|\psi(t)\rangle$ such that $\langle\psi(t)|\psi(t)\rangle = \|\,|\psi(t)\rangle\| = 1$ represents a
possible state at time $t$.

3. Every bounded Hermitian operator represents an observable or measurable
quantity.

4. The properties of the physical system determine the state vectors to within
a phase factor since $\|e^{i\alpha}\,|\psi\rangle\| = \|\,|\psi\rangle\|$.

5. $|\psi(t)\rangle$ is determined by $|\psi(0)\rangle$. Now if $|\psi(0)\rangle$ and $|\phi(0)\rangle$ represent two
possible states at $t = 0$ and $|\psi(t)\rangle$ and $|\phi(t)\rangle$ represent the corresponding
states at time $t$, then $|\langle\phi(0)|\psi(0)\rangle|^2$ equals the probability of finding the
system in the state represented by $|\phi(0)\rangle$ given that the system is in the
state $|\psi(0)\rangle$ at $t = 0$, and $|\langle\phi(t)|\psi(t)\rangle|^2$ equals the probability of finding
the system in the state represented by $|\phi(t)\rangle$ given that the system is in
the state $|\psi(t)\rangle$ at time $t$.

6. It makes physical sense to assume that these two probabilities should be
the same

$$|\langle\phi(0)|\psi(0)\rangle|^2 = |\langle\phi(t)|\psi(t)\rangle|^2$$

Wigner's theorem then says that there exists a unitary, linear operator $\hat{U}(t)$
such that

$$|\psi(t)\rangle = \hat{U}(t)\,|\psi(0)\rangle \tag{6.107}$$

and an expression of the form

$$|\langle\alpha|\hat{U}(t)|\beta\rangle|^2 \tag{6.108}$$

gives the probability that the system is in state $|\alpha\rangle$ at time $t$ given that it was
in state $|\beta\rangle$ at time $t = 0$.

This clearly agrees with our earlier assumption of the existence of such an
operator and strengthens our belief that item (6) above is a valid assumption.

We assume that this expression is a continuous function of $t$. As we have already
shown (6.105), we then have $\hat{U}(t)$ satisfying the equation

$$i\,\frac{d\hat{U}(t)}{dt} = \hat{H}\hat{U}(t) \tag{6.109}$$

or

$$\hat{U}(t) = e^{-i\hat{H}t} \tag{6.110}$$

and thus

$$|\psi(t)\rangle = \hat{U}(t)\,|\psi(0)\rangle = e^{-i\hat{H}t}\,|\psi(0)\rangle \tag{6.111}$$

which implies the following equation of motion for the state vector

$$i\,\frac{d\hat{U}(t)}{dt}\,|\psi(0)\rangle = \hat{H}\hat{U}(t)\,|\psi(0)\rangle \tag{6.112}$$

or

$$i\,\frac{d}{dt}\,|\psi(t)\rangle = \hat{H}\,|\psi(t)\rangle \tag{6.113}$$

which is the abstract form of the famous Schrödinger equation. We will derive
the standard form of this equation later. The operator $\hat{U}(t) = e^{-i\hat{H}t}$ is called
the time evolution operator for reasons that will become clear shortly.

Finally, we can write a time-dependent expectation value as

$$|\psi(t)\rangle = \hat{U}(t)\,|\psi(0)\rangle = e^{-i\hat{H}t}\,|\psi(0)\rangle \tag{6.114}$$

$$\langle\hat{Q}(t)\rangle = \langle\psi(t)|\hat{Q}|\psi(t)\rangle \tag{6.115}$$

This is the Schrödinger picture where state vectors change with time and
operators are constant in time.

As we saw in the discussion above, using the Schrödinger picture depends on
a full knowledge of the Hamiltonian operator $\hat{H}$. However, in the Schrödinger
picture, where we need to know $\hat{H}$ to solve the equation of motion for $|\psi(t)\rangle$, the
equation of motion is such that we seem to need to know the complete solution
for all time, $|\psi(t)\rangle$, to deduce $\hat{H}$. We are trapped in a circle. Put another way,
the Schrödinger equation has no physical content unless we have an independent
way to choose the Hamiltonian operator $\hat{H}$. Before deriving a way to choose $\hat{H}$
(we will use a symmetry approach), we will look at the other pictures.

We note that the Schrödinger picture is not the same as the Schrödinger
equation. The Schrödinger equation involves a mathematical object called the wave
function which is one particular representation of the state vector, namely the
position representation, as we shall see later. Thus, the Schrödinger equation is
applicable only to Hamiltonians that describe operators dependent on external
degrees of freedom like position and momentum. The Schrödinger picture, on
the other hand, works with both internal and external degrees of freedom and
can handle a much wider class of physical systems, as we will see.

Digression: Alternative Approach to Unitary Translation Operators 

We now consider active translations of the state vector in space-time. Since the
length of a state vector cannot change (always normalized to 1) in the standard
probability interpretation of quantum mechanics, active translations must
be represented by unitary operators and correspond to rotations (no change of
length) of the vectors in the Hilbert space.

We use the quantities

$$\psi_\alpha(\vec{r}) = \langle\vec{r}\,|\,\alpha\rangle \tag{6.116}$$

= probability amplitude for the state $|\alpha\rangle$ to be found in the state $|\vec{r}\rangle$

for this discussion.

First, we consider translations in space. For the shifted amplitude we write

$$\psi_{\alpha'}(\vec{r}) = \psi_\alpha(\vec{r} - \vec{\rho}) = \hat{U}_r(\vec{\rho})\,\psi_\alpha(\vec{r}) \tag{6.117}$$

where

$\alpha$, $\alpha'$ label the state vectors
the subscript $r$ indicates spatial translations involved
$\vec{\rho}$ is the displacement vector

The relationship above follows from Figure 6.1 below (it gives the meaning of
translation in space).



Figure 6.1: Space Translation 


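which implies that

$$\psi_{\alpha'}(\vec{r} + \rho\hat{x}, t) = \psi_\alpha(\vec{r}, t) \tag{6.118}$$

To determine the operator $\hat{U}_r(\vec{\rho})$ explicitly, we have oriented the translation
vector $\vec{\rho}$ parallel to the x-axis. We get

$$\begin{aligned}
\psi_\alpha(\vec{r} - \rho\hat{x}) &= \psi_\alpha(x - \rho, y, z) \\
&= \psi_\alpha(x,y,z) - \rho\,\frac{\partial}{\partial x}\psi_\alpha(x,y,z) + \frac{\rho^2}{2!}\frac{\partial^2}{\partial x^2}\psi_\alpha(x,y,z) - \dots \\
&= e^{-\rho\frac{\partial}{\partial x}}\,\psi_\alpha(x,y,z)
\end{aligned} \tag{6.119}$$

where we have used a Taylor expansion and defined the exponential operator
by

$$e^{-\rho\frac{\partial}{\partial x}} = 1 - \rho\,\frac{\partial}{\partial x} + \frac{\rho^2}{2!}\frac{\partial^2}{\partial x^2} - \dots \tag{6.120}$$

For a translation in an arbitrary direction $\vec{\rho}$, the generalization of this result is
accomplished by the replacement

$$\rho\,\frac{\partial}{\partial x} \to \vec{\rho}\cdot\nabla \tag{6.121}$$

so that

$$\psi_\alpha(\vec{r} - \vec{\rho}) = e^{-\vec{\rho}\cdot\nabla}\,\psi_\alpha(\vec{r}) = e^{-i\vec{\rho}\cdot\hat{\vec{p}}/\hbar}\,\psi_\alpha(\vec{r}) \tag{6.122}$$

where we have used

$$\hat{\vec{p}} = -i\hbar\nabla \tag{6.123}$$

Later we shall find that $\hat{\vec{p}}$ is the linear momentum operator. Thus, we find that
the spatial translation operator is given by

$$\hat{U}_r(\vec{\rho}) = e^{-i\vec{\rho}\cdot\hat{\vec{p}}/\hbar} \tag{6.124}$$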

We will derive this result from first principles using symmetry arguments shortly. 
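
The Taylor-series definition of the exponential derivative operator can be tried out on a polynomial, where the series terminates after finitely many terms. The sketch below assumes a one-dimensional polynomial stored as a coefficient list; the function names and the sample polynomial are illustrative.

```python
import math

def derivative(coeffs):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0.0]

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def translate(coeffs, rho):
    """Apply exp(-rho d/dx) term by term; the series ends for a polynomial."""
    result = [0.0] * len(coeffs)
    term = list(coeffs)               # k-th derivative of the polynomial
    k = 0
    while any(term):
        factor = (-rho) ** k / math.factorial(k)
        for i, c in enumerate(term):
            result[i] += factor * c
        term = derivative(term)
        k += 1
    return result

f = [1.0, -2.0, 0.0, 3.0]             # assumed example: f(x) = 1 - 2x + 3x^3
rho = 0.5
shifted = translate(f, rho)           # coefficients of f(x - rho)

x = 2.0
print(evaluate(shifted, x), evaluate(f, x - rho))
```

Both numbers printed are f(1.5), showing that the exponential of the derivative shifts the argument by the displacement.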

Time Displacements in Quantum Mechanics 

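We now investigate the time displacement of a state function $\psi_\alpha(\vec{r},t)$ by a time
interval $\tau$ as shown in Figure 6.2 below.

Figure 6.2: Time Translation

As before, we have

$$\psi_{\alpha'}(\vec{r}, t+\tau) = \psi_\alpha(\vec{r}, t) \tag{6.125}$$

We now represent this transformation with a unitary operator $\hat{U}_t(\tau)$ such that

$$\psi_{\alpha'}(\vec{r}, t) = \hat{U}_t(\tau)\,\psi_\alpha(\vec{r}, t) = \psi_\alpha(\vec{r}, t-\tau) \tag{6.126}$$

We again make a Taylor expansion to get

$$\begin{aligned}
\hat{U}_t(\tau)\,\psi_\alpha(\vec{r},t) &= \psi_\alpha(\vec{r},t) - \tau\,\frac{\partial}{\partial t}\psi_\alpha(\vec{r},t) + \dots \\
&= e^{-\tau\frac{\partial}{\partial t}}\,\psi_\alpha(\vec{r},t)
\end{aligned} \tag{6.127}$$

It follows that

$$\hat{U}_t(\tau) = e^{-\tau\frac{\partial}{\partial t}} = e^{i\tau\hat{E}/\hbar} = e^{i\tau\hat{H}/\hbar} \tag{6.128}$$

where we have used

$$\hat{E} = \hat{H} = i\hbar\,\frac{\partial}{\partial t} \tag{6.129}$$

Later we shall find that $\hat{E} = \hat{H}$ is the energy or Hamiltonian operator.

We find that the time evolution operator for state vectors (kets) is given as
follows

$$\hat{U}_t(\tau)\,|\vec{r},t,\alpha\rangle = e^{i\hat{H}\tau/\hbar}\,|\vec{r},t,\alpha\rangle = |\vec{r},t-\tau,\alpha\rangle \tag{6.130}$$

$$e^{i\hat{H}\tau/\hbar}\,|\vec{r},t+\tau,\alpha\rangle = |\vec{r},t,\alpha\rangle \tag{6.131}$$

$$|\vec{r},t+\tau,\alpha\rangle = e^{-i\hat{H}\tau/\hbar}\,|\vec{r},t,\alpha\rangle \tag{6.132}$$

$$|\vec{r},\tau,\alpha\rangle = e^{-i\hat{H}\tau/\hbar}\,|\vec{r},0,\alpha\rangle \tag{6.133}$$

$$|\vec{r},t,\alpha\rangle = e^{-i\hat{H}t/\hbar}\,|\vec{r},0,\alpha\rangle \tag{6.134}$$

A result identical to our earlier derivation (6.110).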

6.7. The Heisenberg Picture 

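As we saw earlier, we can think of the expectation value in two different ways,
namely,

$$|\psi(t)\rangle = \hat{U}(t)\,|\psi(0)\rangle = e^{-i\hat{H}t}\,|\psi(0)\rangle \tag{6.135}$$

$$\begin{aligned}
\langle\hat{Q}(t)\rangle &= \langle\psi(t)|\hat{Q}|\psi(t)\rangle = \langle\psi(0)|\,\hat{U}^\dagger(t)\hat{Q}\hat{U}(t)\,|\psi(0)\rangle \\
&= \langle\psi(0)|\hat{Q}(t)|\psi(0)\rangle
\end{aligned} \tag{6.136}$$

where

$$\hat{Q}(t) = \hat{U}^\dagger(t)\hat{Q}\hat{U}(t) = \hat{U}^\dagger(t)\hat{Q}(0)\hat{U}(t) = e^{i\hat{H}t}\hat{Q}e^{-i\hat{H}t} \tag{6.137}$$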

This implies that in the Heisenberg picture operators change with time and 
states are constant in time. 
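
The agreement between the two pictures can be verified directly on a two-level system. In the sketch below the Hamiltonian is assumed diagonal with made-up energies (so its exponential is known in closed form), the observable and initial state are illustrative choices, and units with hbar = 1 are used as in the text.

```python
import cmath
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(op, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

def inner(bra, ket):
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

E = [0.0, 2.0]                         # assumed energy eigenvalues of H

def U(t):
    """U(t) = exp(-i H t) for the diagonal H = diag(E[0], E[1])."""
    return [[cmath.exp(-1j * E[0] * t), 0j], [0j, cmath.exp(-1j * E[1] * t)]]

Q = [[0j, 1 + 0j], [1 + 0j, 0j]]       # assumed observable
psi0 = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]
t = 0.3

# Schrodinger picture: evolve the state, keep the operator fixed.
psi_t = apply(U(t), psi0)
schrodinger = inner(psi_t, apply(Q, psi_t)).real

# Heisenberg picture: evolve the operator, keep the state fixed.
Q_t = matmul(dagger(U(t)), matmul(Q, U(t)))
heisenberg = inner(psi0, apply(Q_t, psi0)).real

print(schrodinger, heisenberg, math.cos((E[1] - E[0]) * t))
```

For this example both pictures give cos((E1 - E0) t), the expected oscillation at the energy difference.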

Now $\hat{U}(t)$ is a unitary, linear operator. This means that the transformation
preserves these properties:

1. $\hat{Q}(0)$ bounded $\to$ $\hat{Q}(t)$ bounded

2. $\hat{Q}(0)$ Hermitian $\to$ $\hat{Q}(t)$ Hermitian

3. $\hat{Q}(0)$ positive $\to$ $\hat{Q}(t)$ positive

4. $\hat{Q}(0)$ and $\hat{Q}(t)$ have the same spectrum

In addition, if the spectral decomposition of $\hat{Q}(0)$ gives the projection operators
$\hat{E}_x(0)$, then the spectral decomposition of $\hat{Q}(t)$ gives the projection operators
$\hat{E}_x(t) = e^{i\hat{H}t}\hat{E}_x(0)e^{-i\hat{H}t}$.

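This follows from:

$$\begin{aligned}
\langle\phi|\hat{Q}(t)|\psi\rangle &= \int_{-\infty}^{\infty} x\,d\langle\phi|\hat{E}_x(t)|\psi\rangle \\
&= \langle\phi|\,e^{i\hat{H}t}\hat{Q}(0)e^{-i\hat{H}t}\,|\psi\rangle = \langle\phi(t)|\hat{Q}(0)|\psi(t)\rangle \\
&= \int_{-\infty}^{\infty} x\,d\langle\phi(t)|\hat{E}_x(0)|\psi(t)\rangle = \int_{-\infty}^{\infty} x\,d\langle\phi|\,e^{i\hat{H}t}\hat{E}_x(0)e^{-i\hat{H}t}\,|\psi\rangle
\end{aligned} \tag{6.138}$$

For a function $F$ of $\hat{Q}(t)$ we have

$$\begin{aligned}
\langle\phi|F(\hat{Q}(t))|\psi\rangle &= \int_{-\infty}^{\infty} F(x)\,d\langle\phi|\hat{E}_x(t)|\psi\rangle \\
&= \int_{-\infty}^{\infty} F(x)\,d\langle\phi|\,e^{i\hat{H}t}\hat{E}_x(0)e^{-i\hat{H}t}\,|\psi\rangle \\
&= \int_{-\infty}^{\infty} F(x)\,d\langle\phi(t)|\hat{E}_x(0)|\psi(t)\rangle \\
&= \langle\phi(t)|F(\hat{Q}(0))|\psi(t)\rangle = \langle\phi|\,e^{i\hat{H}t}F(\hat{Q}(0))e^{-i\hat{H}t}\,|\psi\rangle
\end{aligned} \tag{6.139}$$

or

$$F(\hat{Q}(t)) = e^{i\hat{H}t}F(\hat{Q}(0))e^{-i\hat{H}t} \tag{6.140}$$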

If $\{\hat{Q}_i(0)\}$ represents a complete set of mutually commuting Hermitian
operators, then $\{\hat{Q}_i(t)\}$ is also a complete set of mutually commuting Hermitian
operators. In addition,

$$F(\{\hat{Q}_i(t)\}) = e^{i\hat{H}t}F(\{\hat{Q}_i(0)\})e^{-i\hat{H}t} \tag{6.141}$$

and all algebraic relations between non-commuting operators are preserved in
time.

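Since $[f(\hat{Q}), \hat{Q}] = 0$ for any operator $\hat{Q}$, and in particular for $\hat{H}$, we have

$$\hat{U}^\dagger(t)\hat{H}\hat{U}(t) = e^{i\hat{H}t}\hat{H}e^{-i\hat{H}t} = \hat{H} \tag{6.142}$$

which implies that $\hat{H}$, the Hamiltonian operator, is a constant of the motion (in
time).

6.8. Interaction Picture

We have derived the equation

$$i\,\frac{d}{dt}\,|\psi(t)\rangle = \hat{H}(t)\,|\psi(t)\rangle \tag{6.143}$$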

Now let us assume that $\hat{H}$ can be split up as follows:

$$\hat{H}(t) = \hat{H}_0 + \hat{V}(t) \tag{6.144}$$

where $\hat{H}_0$ is independent of $t$. This will be possible in many real physical
systems. Note that the equation

$$i\,\frac{d\hat{U}(t)}{dt} = \hat{H}(t)\hat{U}(t) \tag{6.145}$$

does not have the simple solution

$$\hat{U}(t) = e^{-i\hat{H}t} \tag{6.146}$$

in this case.

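We now define a new state vector by the relation

$$|\psi_I(t)\rangle = e^{i\hat{H}_0t}\,|\psi(t)\rangle \tag{6.147}$$

Taking derivatives we get

$$\begin{aligned}
i\,\frac{d}{dt}\,|\psi_I(t)\rangle &= i\,\frac{d}{dt}\big(e^{i\hat{H}_0t}\,|\psi(t)\rangle\big) = -e^{i\hat{H}_0t}\hat{H}_0\,|\psi(t)\rangle + e^{i\hat{H}_0t}\,i\,\frac{d}{dt}\,|\psi(t)\rangle \\
&= -e^{i\hat{H}_0t}\hat{H}_0\,|\psi(t)\rangle + e^{i\hat{H}_0t}\hat{H}\,|\psi(t)\rangle = e^{i\hat{H}_0t}\hat{V}\,|\psi(t)\rangle \\
&= e^{i\hat{H}_0t}\hat{V}e^{-i\hat{H}_0t}\,e^{i\hat{H}_0t}\,|\psi(t)\rangle
\end{aligned} \tag{6.148}$$

or

$$i\,\frac{d}{dt}\,|\psi_I(t)\rangle = \hat{V}_I(t)\,|\psi_I(t)\rangle \tag{6.149}$$

where we have defined

$$\hat{V}_I(t) = e^{i\hat{H}_0t}\hat{V}e^{-i\hat{H}_0t} \tag{6.150}$$

We then have

$$\langle\hat{Q}(t)\rangle = \langle\psi(t)|\hat{Q}|\psi(t)\rangle = \langle\psi_I(t)|\,e^{i\hat{H}_0t}\hat{Q}e^{-i\hat{H}_0t}\,|\psi_I(t)\rangle \tag{6.151}$$

This says that in the interaction picture both the state vectors and the operators
are dependent on time. Their time development, however, depends on different
parts of $\hat{H}$. The state vector time development depends on $\hat{V}_I(t)$ and the
operator time development depends on $\hat{H}_0$. It is, in some sense, intermediate
between the Schrödinger and Heisenberg pictures.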

The question still remains, however: how do we find $\hat{H}$?

We now turn to a more general approach based on symmetries to get a handle 
on how to deal with this problem. 


6.9. Symmetries of Space-Time 

Since we are only considering non-relativistic quantum mechanics at this stage, 
we will restrict our attention to velocities that are small compared to the speed 
of light. 

In this case, the set of all displacements in space-time, rotations in space and
Galilean boosts (Lorentz boosts in the low velocity limit) can be represented by
transformations where

$$\vec{x} \to \vec{x}\,' = R\vec{x} + \vec{a} + \vec{v}t \quad,\quad t \to t' = t + s \tag{6.152}$$

where $R$ is a rotation of the real 3-vectors $\vec{x}$ in 3-dimensional space, $\vec{a}$ is a real
3-vector that specifies space translations, $\vec{v}$ is a real 3-vector that represents the
velocity of a moving coordinate system and specifies the Galilean transformations,
and the real number $s$ specifies the time translation.


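$R$ can be thought of as a 3 x 3 matrix such that under a pure rotation

$$x_j' = \sum_i R_{ji}\,x_i \tag{6.153}$$

or

$$\begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \tag{6.154}$$

For example, a rotation by angle $\theta$ about the $x_3$ (or z)-axis is given by

$$R_3(\theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{6.155}$$

or

$$\begin{aligned}
x_1' &= x_1\cos\theta + x_2\sin\theta \\
x_2' &= -x_1\sin\theta + x_2\cos\theta \\
x_3' &= x_3
\end{aligned} \tag{6.156}$$

which corresponds to Figure 6.3 below.

Figure 6.3: Rotation about z-axis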


If we let $T_1$ and $T_2$ be two such transformations, then $T_1T_2$ is the transformation
corresponding to $T_2$ followed by $T_1$.

A set of transformations forms a group when:

1. The product of two transformations in the group is also a transformation
in the group.

2. The product is associative, $T_3(T_2T_1) = (T_3T_2)T_1$.

3. An identity transformation $T_0$ exists such that $\vec{x} \to \vec{x}$ and $t \to t$, or $T_0T =
T = TT_0$ for all transformations $T$.

4. An inverse transformation $T^{-1}$ exists for every transformation such that
$T^{-1}T = T_0 = TT^{-1}$.

A subset of transformations $T(\tau)$ depending on a real parameter $\tau$ is called a
one-parameter subgroup if $T(0) = T_0$ = the identity and $T(\tau_1 + \tau_2) = T(\tau_1)T(\tau_2)$.
Rotations about a fixed axis, as we saw above, form a one-parameter subgroup where
the parameter is the angle of rotation.
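
As an illustration of the one-parameter subgroup property, the sketch below builds the rotation matrix about the 3-axis in the form given in (6.155) and checks the composition, identity and inverse rules numerically. The angles are arbitrary test values and the helper names are assumptions.

```python
import math

def R3(theta):
    """Rotation about the x_3 axis in the form used in (6.155)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
a, b = 0.3, 1.2                        # arbitrary test angles

composition_error = max_diff(R3(a + b), matmul(R3(a), R3(b)))   # T(a+b) = T(a)T(b)
identity_error = max_diff(R3(0.0), identity)                    # T(0) = identity
inverse_error = max_diff(matmul(R3(-a), R3(a)), identity)       # T(-a) is the inverse

print(composition_error, identity_error, inverse_error)
```

All three errors are at rounding level, so rotations about a fixed axis do behave as a one-parameter subgroup with the angle as the additive parameter.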

Since products of the transformations from the three one-parameter subgroups,
which correspond to rotations about the 1-, 2-, and 3-axes separately, include
all possible rotations, there are three independent parameters (three angles)
describing all rotations (remember the Euler angles from Classical Mechanics).

The total group of transformations has 10 parameters

1. rotations = 3 (3 angles)

2. space translations = 3 (3 components of $\vec{a}$)

3. Galilean boosts = 3 (3 components of $\vec{v}$)

4. time translation = 1 (parameter $s$)

Each of these ten parameters corresponds to a one-parameter subgroup. A
ten-parameter group transformation is then defined by the product of ten
one-parameter subgroup transformations.

This means that in our discussions we need only consider the properties of 
one-parameter subgroup transformations in order to understand general group 
transformations. 

In our earlier discussion we have already covered part of this topic when we
derived the time evolution operator $\hat{U}(t)$, which clearly corresponds to the time
translation transformation.

Our earlier results tell us that for each of the ten transformations there exists a
linear, unitary operator $\hat{U}(\tau)$ on the state vectors and observables such that

$$|\psi\rangle \to |\psi'\rangle = \hat{U}(\tau)\,|\psi\rangle \quad\text{and}\quad \hat{Q} \to \hat{Q}' = \hat{U}(\tau)\hat{Q}\hat{U}^{-1}(\tau) \tag{6.157}$$

where $\hat{U}(\tau)$ takes the general form

$$\hat{U}(\tau) = e^{i\tau\hat{G}} \tag{6.158}$$

and $\hat{G}$ = a Hermitian operator = the generator of the transformation.

The time evolution operator we derived earlier is a good example:
if $|\psi(t)\rangle$ = state vector at time $t$, then for $t' = t + s$

$$|\psi(t')\rangle = e^{is\hat{H}}\,|\psi(t)\rangle \tag{6.159}$$


6.10. Generators of the Group Transformations 


In general we can write

$$\hat{U}(s_\mu) = \prod_{\mu=1}^{10} e^{is_\mu\hat{K}_\mu} \tag{6.160}$$

where the different $s_\mu$ represent the ten parameters defining the group
transformation represented by the operator $\hat{U}(s_\mu)$ and

$$\hat{K}_\mu = \hat{K}_\mu^\dagger = \text{the Hermitian generators (there are 10)} \tag{6.161}$$

Now, let all the parameters become infinitesimally small so that we get
infinitesimal unitary operators of the form (expanding the exponentials)

$$\hat{U} = \prod_{\mu=1}^{10} e^{is_\mu\hat{K}_\mu} = \prod_{\mu=1}^{10}\big(\hat{I} + is_\mu\hat{K}_\mu + \dots\big) \tag{6.162}$$

or to first order in the parameters

$$\hat{U} = \hat{I} + i\sum_{\mu=1}^{10} s_\mu\hat{K}_\mu \tag{6.163}$$

Note that the inverse transformation corresponding to $e^{is_\mu\hat{K}_\mu}$ is $e^{-is_\mu\hat{K}_\mu}$.

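We now construct the product transformation consisting of two infinitesimal
transformations followed by their inverses to get

$$e^{i\epsilon\hat{K}_\mu}e^{i\epsilon\hat{K}_\nu}e^{-i\epsilon\hat{K}_\mu}e^{-i\epsilon\hat{K}_\nu} = \hat{I} + \epsilon^2[\hat{K}_\nu, \hat{K}_\mu] + O(\epsilon^3) \tag{6.164}$$

where

$$[\hat{K}_\nu, \hat{K}_\mu] = \hat{K}_\nu\hat{K}_\mu - \hat{K}_\mu\hat{K}_\nu = \text{commutator} \tag{6.165}$$

The algebraic steps involved are shown below.

$$\begin{aligned}
& e^{i\epsilon\hat{K}_\mu}e^{i\epsilon\hat{K}_\nu}e^{-i\epsilon\hat{K}_\mu}e^{-i\epsilon\hat{K}_\nu} \\
&= \big(\hat{I} + i\epsilon\hat{K}_\mu - \tfrac{1}{2}\epsilon^2\hat{K}_\mu^2\big)\big(\hat{I} + i\epsilon\hat{K}_\nu - \tfrac{1}{2}\epsilon^2\hat{K}_\nu^2\big) \times \big(\hat{I} - i\epsilon\hat{K}_\mu - \tfrac{1}{2}\epsilon^2\hat{K}_\mu^2\big)\big(\hat{I} - i\epsilon\hat{K}_\nu - \tfrac{1}{2}\epsilon^2\hat{K}_\nu^2\big) \\
&= \big(\hat{I} + i\epsilon(\hat{K}_\mu + \hat{K}_\nu) - \epsilon^2\hat{K}_\mu\hat{K}_\nu - \tfrac{1}{2}\epsilon^2(\hat{K}_\mu^2 + \hat{K}_\nu^2)\big) \times \big(\hat{I} - i\epsilon(\hat{K}_\mu + \hat{K}_\nu) - \epsilon^2\hat{K}_\mu\hat{K}_\nu - \tfrac{1}{2}\epsilon^2(\hat{K}_\mu^2 + \hat{K}_\nu^2)\big) \\
&= \hat{I} + i\epsilon(\hat{K}_\mu + \hat{K}_\nu) - \epsilon^2\hat{K}_\mu\hat{K}_\nu - \tfrac{1}{2}\epsilon^2(\hat{K}_\mu^2 + \hat{K}_\nu^2) \\
&\quad - i\epsilon(\hat{K}_\mu + \hat{K}_\nu) - \epsilon^2\hat{K}_\mu\hat{K}_\nu - \tfrac{1}{2}\epsilon^2(\hat{K}_\mu^2 + \hat{K}_\nu^2) \\
&\quad + \big(i\epsilon(\hat{K}_\mu + \hat{K}_\nu)\big)\big(-i\epsilon(\hat{K}_\mu + \hat{K}_\nu)\big) \\
&= \hat{I} - \epsilon^2\hat{K}_\mu^2 - \epsilon^2\hat{K}_\nu^2 - 2\epsilon^2\hat{K}_\mu\hat{K}_\nu + \epsilon^2\hat{K}_\mu^2 + \epsilon^2\hat{K}_\nu^2 + \epsilon^2\hat{K}_\mu\hat{K}_\nu + \epsilon^2\hat{K}_\nu\hat{K}_\mu \\
&= \hat{I} + \epsilon^2\hat{K}_\nu\hat{K}_\mu - \epsilon^2\hat{K}_\mu\hat{K}_\nu \\
&= \hat{I} + \epsilon^2[\hat{K}_\nu, \hat{K}_\mu] + O(\epsilon^3)
\end{aligned}$$

We have expanded to 2nd order in $\epsilon$ in this derivation to explicitly see all the
higher order terms cancel out. In general, this is not necessary.

But remember that these transformations are part of a group and therefore the
product of the four transformations must also be a transformation $\hat{W}$ in the same
group. Actually, it only needs to be a member of the group to within an arbitrary
phase factor of the form $e^{i\alpha}$ (remember that the unitary transformations only
need to preserve $|\langle\phi|\psi\rangle|$). We then have

$$e^{i\alpha}\hat{W} = \hat{I} + \epsilon^2[\hat{K}_\nu, \hat{K}_\mu] \tag{6.166}$$

$$(1 + i\alpha)\hat{I} + i\sum_\mu s_\mu\hat{K}_\mu = \hat{I} + \epsilon^2[\hat{K}_\nu, \hat{K}_\mu] \tag{6.167}$$

$$i\alpha\hat{I} + i\sum_\mu s_\mu\hat{K}_\mu = \epsilon^2[\hat{K}_\nu, \hat{K}_\mu] \tag{6.168}$$

where we have used

$$\hat{W} = \prod_\mu e^{is_\mu\hat{K}_\mu} \tag{6.169}$$

and expanded all the exponentials on the left-hand side of the equation to first
order.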


Therefore, the most general mathematical statement that we can make based
on this result is that the commutator must take the form

$$[\hat{K}_\mu, \hat{K}_\nu] = i\sum_\lambda c^\lambda_{\mu\nu}\hat{K}_\lambda + ib_{\mu\nu}\hat{I} \tag{6.170}$$

where the real numbers $c^\lambda_{\mu\nu}$ = the structure constants (of the group) and the term
involving the identity operator just corresponds to the existence of an arbitrary
phase factor. The structure constants and the $b_{\mu\nu}$'s are completely determined by
the group algebra as we shall see.
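
The second-order result for the product of two infinitesimal transformations followed by their inverses can be tested numerically. The sketch below assumes two specific 2x2 Hermitian generators (half of the Pauli x and y matrices, chosen only for illustration) and uses a simple truncated Taylor series, `expm`, written for this example.

```python
def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=30):
    """Truncated Taylor series for exp(a); fine for the tiny matrices used here."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed Hermitian generators (illustrative): sigma_x / 2 and sigma_y / 2.
K_mu = [[0j, 0.5 + 0j], [0.5 + 0j, 0j]]
K_nu = [[0j, -0.5j], [0.5j, 0j]]

def exp_i(eps, K):
    """exp(i * eps * K)."""
    return expm([[1j * eps * x for x in row] for row in K])

eps = 1e-3
product = matmul(matmul(exp_i(eps, K_mu), exp_i(eps, K_nu)),
                 matmul(exp_i(-eps, K_mu), exp_i(-eps, K_nu)))

# Commutator [K_nu, K_mu] = K_nu K_mu - K_mu K_nu.
a, b = matmul(K_nu, K_mu), matmul(K_mu, K_nu)
commutator = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

predicted = [[i + eps ** 2 * c for i, c in zip(ri, rc)]
             for ri, rc in zip(identity(2), commutator)]

error = max_diff(product, predicted)          # expected to be of order eps^3
deviation = max_diff(product, identity(2))    # expected to be of order eps^2

print(error, deviation, commutator)
```

For these generators the commutator is proportional to a third Hermitian matrix (half of the Pauli z matrix), which is an instance of the commutator being a linear combination of generators.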


By convention, we define the transformations as follows:

1. Rotation about the $\alpha$-axis ($\alpha$ = 1,2,3)

$$\vec{x} \to R_\alpha(\theta_\alpha)\,\vec{x} \tag{6.171}$$

corresponds to the group operator

$$\hat{U} = e^{-i\theta_\alpha\hat{J}_\alpha} \tag{6.172}$$

where $\hat{J}_\alpha$ = generators ($\alpha$ = 1,2,3)

2. Displacement along the $\alpha$-axis ($\alpha$ = 1,2,3)

$$x_\alpha \to x_\alpha + a_\alpha \tag{6.173}$$

corresponds to the group operator

$$\hat{U} = e^{-ia_\alpha\hat{P}_\alpha} \tag{6.174}$$

where $\hat{P}_\alpha$ = generators ($\alpha$ = 1,2,3)

3. Velocity boost along the $\alpha$-axis ($\alpha$ = 1,2,3)

$$x_\alpha \to x_\alpha + v_\alpha t \tag{6.175}$$

corresponds to the group operator

$$\hat{U} = e^{iv_\alpha\hat{G}_\alpha} \tag{6.176}$$

where $\hat{G}_\alpha$ = generators ($\alpha$ = 1,2,3)

4. Time displacement

$$t \to t + s \tag{6.177}$$

corresponds to the group operator

$$\hat{U} = e^{is\hat{H}} \tag{6.178}$$

where $\hat{H}$ = generator = Hamiltonian

6.11. Commutators and Identities 

Initially, we will ignore the extra $\hat I$ term in equation (6.170), repeated below,

$$[\hat K_\mu,\hat K_\nu] = i\sum_\lambda c^\lambda_{\mu\nu}\hat K_\lambda + i b_{\mu\nu}\hat I \tag{6.179}$$

in our discussions and then include its effect (if any) later on.

We can determine some of the commutators using physical arguments as follows: 

1. space displacements along different axes are independent of each other, which implies that

$$[\hat P_\alpha,\hat P_\beta] = 0 \tag{6.180}$$

2. space displacements are independent of time displacements, which implies that

$$[\hat P_\alpha,\hat H] = 0 \tag{6.181}$$

3. velocity boosts along different axes are independent of each other, which implies that

$$[\hat G_\alpha,\hat G_\beta] = 0 \tag{6.182}$$

4. rotations are independent of time displacements, which implies that

$$[\hat J_\alpha,\hat H] = 0 \tag{6.183}$$

5. space displacements and velocity boosts along a given axis are independent of rotations about that axis

$$[\hat J_\alpha,\hat P_\alpha] = 0 = [\hat J_\alpha,\hat G_\alpha] \tag{6.184}$$

6. obviously

$$[\hat H,\hat H] = 0 \tag{6.185}$$

This leaves us to consider these remaining unknown commutators:

$$[\hat J_\alpha,\hat P_\beta],\quad [\hat G_\alpha,\hat H],\quad [\hat J_\alpha,\hat J_\beta],\quad [\hat G_\alpha,\hat J_\beta],\quad [\hat G_\alpha,\hat P_\beta] \tag{6.186}$$

Let us consider $[\hat G_1,\hat H]$ first.

The general procedure is as follows. We write down a product of four operators 
consisting of the product of two operators representing a velocity boost in the 
1-direction and a time translation and their inverses (as we did earlier). 


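Now these four successive transformations correspond to these changes of the coordinates:

$$\begin{aligned}
(x_1,x_2,x_3,t) &\to (x_1-\epsilon t,\;x_2,\;x_3,\;t) && \text{velocity boost at time } t\\
&\to (x_1-\epsilon t,\;x_2,\;x_3,\;t-\epsilon) && \text{time translation, only affects } t\\
&\to (x_1-\epsilon t+\epsilon(t-\epsilon),\;x_2,\;x_3,\;t-\epsilon) && \text{velocity boost at time } t-\epsilon\\
&\to (x_1-\epsilon^2,\;x_2,\;x_3,\;t) && \text{time translation, only affects } t
\end{aligned}$$

This last result just corresponds to a space displacement $-\epsilon^2$ along the 1-axis, so equating the product of four transformations to a space translation, we have the result

$$\begin{aligned}
e^{i\epsilon\hat H}e^{i\epsilon\hat G_1}e^{-i\epsilon\hat H}e^{-i\epsilon\hat G_1}
&= (\hat I+i\epsilon\hat H)(\hat I+i\epsilon\hat G_1)(\hat I-i\epsilon\hat H)(\hat I-i\epsilon\hat G_1)\\
&= \hat I+\epsilon^2[\hat G_1,\hat H]+O(\epsilon^3)\\
&= e^{-i(-\epsilon^2)\hat P_1} = \hat I+i\epsilon^2\hat P_1
\end{aligned}$$

so that we find the result for the commutator

$$[\hat G_1,\hat H] = i\hat P_1 \tag{6.187}$$

In general, we get (using this same procedure)

$$[\hat G_\alpha,\hat H] = i\hat P_\alpha \tag{6.188}$$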

So we have determined one of the unknown commutators. 
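The coordinate bookkeeping behind this result is easy to replay in code. The following sketch (function names are hypothetical) applies the four steps listed above to the pair $(x_1, t)$ and confirms that the net effect is a pure space displacement by $-\epsilon^2$.

```python
def boost(v):
    """Velocity boost along the 1-axis: x1 -> x1 + v*t, t unchanged."""
    return lambda x1, t: (x1 + v * t, t)

def time_shift(s):
    """Time displacement: t -> t + s, x1 unchanged."""
    return lambda x1, t: (x1, t + s)

def four_step(x1, t, eps):
    """Boost by -eps, shift time by -eps, boost by +eps, shift time by +eps."""
    for step in (boost(-eps), time_shift(-eps), boost(eps), time_shift(eps)):
        x1, t = step(x1, t)
    return x1, t

x1, t, eps = 2.0, 3.0, 0.5
print(four_step(x1, t, eps))   # (1.75, 3.0)
print((x1 - eps ** 2, t))      # (1.75, 3.0): a displacement by -eps**2
```

Note that the result here is exact, not just correct to second order, because the coordinate maps involved are linear in $t$.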


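Now let us determine $[\hat J_\alpha,\hat J_\beta]$ using the same type of procedure. For a rotation we saw that

$$x_j \to \sum_k R_{jk}x_k \tag{6.189}$$

where

$$R_1(\theta) = \begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\theta & \sin\theta\\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{6.190}$$

$$R_2(\theta) = \begin{pmatrix} \cos\theta & 0 & -\sin\theta\\ 0 & 1 & 0\\ \sin\theta & 0 & \cos\theta \end{pmatrix} \tag{6.191}$$

$$R_3(\theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0\\ -\sin\theta & \cos\theta & 0\\ 0 & 0 & 1 \end{pmatrix} \tag{6.192}$$

For small $\theta$ we have

$$R_\alpha(\theta) = I - i\theta M_\alpha \tag{6.193}$$

where $M_\alpha$ is determined by expanding the exponential to 1st order as

$$M_1 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & i\\ 0 & -i & 0 \end{pmatrix} \tag{6.194}$$

$$M_2 = \begin{pmatrix} 0 & 0 & -i\\ 0 & 0 & 0\\ i & 0 & 0 \end{pmatrix} \tag{6.195}$$

$$M_3 = \begin{pmatrix} 0 & i & 0\\ -i & 0 & 0\\ 0 & 0 & 0 \end{pmatrix} \tag{6.196}$$

Then we have

$$R_2(-\epsilon)R_1(-\epsilon)R_2(\epsilon)R_1(\epsilon) = I + \epsilon^2[M_1,M_2] = I + i\epsilon^2M_3 \tag{6.197}$$

$$= R_3(-\epsilon^2) \tag{6.198}$$

or the product of four rotations is equivalent to a single rotation, which implies that

$$[\hat J_1,\hat J_2] = i\hat J_3 \tag{6.199}$$

and in general

$$[\hat J_\alpha,\hat J_\beta] = i\epsilon_{\alpha\beta\gamma}\hat J_\gamma \tag{6.200}$$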

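where we are using the Einstein summation convention for repeated indices and $\epsilon_{\alpha\beta\gamma}$ is the antisymmetric permutation symbol we introduced earlier with the properties

$$\epsilon_{\alpha\beta\gamma} = \begin{cases} +1 & \text{if } \alpha\beta\gamma \text{ is an even permutation of } 123\\ -1 & \text{if } \alpha\beta\gamma \text{ is an odd permutation of } 123\\ \;\;\,0 & \text{if any two indices are the same} \end{cases} \tag{6.201}$$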
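As a concrete illustration of (6.200) and (6.201), the sketch below builds the permutation symbol and checks the angular momentum algebra for one particular set of matrices. The choice $\hat J_\alpha = \sigma_\alpha/2$ (the Pauli matrices, which appear later in this chapter) is an assumption made only for this check, and the function names are hypothetical.

```python
def levi_civita(a, b, c):
    """Permutation symbol for indices taken from {1, 2, 3}."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    return add(matmul(a, b), matmul(b, a), -1)

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed representation: J_alpha = sigma_alpha / 2.
J = {1: [[0, 0.5], [0.5, 0]],
     2: [[0, -0.5j], [0.5j, 0]],
     3: [[0.5, 0], [0, -0.5]]}

def right_hand_side(a, b):
    """i * sum over gamma of eps_{a b gamma} J_gamma."""
    total = [[0, 0], [0, 0]]
    for g in (1, 2, 3):
        total = add(total, scale(1j * levi_civita(a, b, g), J[g]))
    return total

def algebra_holds():
    return all(max_abs_diff(comm(J[a], J[b]), right_hand_side(a, b)) < 1e-12
               for a in (1, 2, 3) for b in (1, 2, 3))

print(levi_civita(1, 2, 3), levi_civita(2, 1, 3), levi_civita(1, 1, 3))  # 1 -1 0
print(algebra_holds())  # True
```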
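Finally, we consider

$$\begin{aligned}
e^{i\epsilon\hat G_2}e^{i\epsilon\hat J_1}e^{-i\epsilon\hat G_2}e^{-i\epsilon\hat J_1}
&= (\hat I+i\epsilon\hat G_2)(\hat I+i\epsilon\hat J_1)(\hat I-i\epsilon\hat G_2)(\hat I-i\epsilon\hat J_1)\\
&= \hat I+\epsilon^2[\hat J_1,\hat G_2]+O(\epsilon^3)
\end{aligned} \tag{6.202}$$

This involves a rotation by $\epsilon$ about the 1-axis and a velocity boost of $\epsilon$ along the 2-axis. This product transformation changes the coordinates as follows:

$$\begin{aligned}
(x_1,x_2,x_3) &\to (x_1,\;x_2\cos\epsilon+x_3\sin\epsilon,\;-x_2\sin\epsilon+x_3\cos\epsilon)\\
&\to (x_1,\;x_2\cos\epsilon+x_3\sin\epsilon-\epsilon t,\;-x_2\sin\epsilon+x_3\cos\epsilon)\\
&\to (x_1,\;x_2+\epsilon t\cos\epsilon,\;x_3+\epsilon t\sin\epsilon)\\
&\to (x_1,\;x_2,\;x_3+\epsilon^2 t) \quad\text{to 2nd order in } \epsilon
\end{aligned}$$

This is the same as

$$e^{i\epsilon^2\hat G_3} = \hat I + i\epsilon^2\hat G_3 \tag{6.203}$$

Thus, we have

$$[\hat J_1,\hat G_2] = i\hat G_3 \tag{6.204}$$

or in general,

$$[\hat J_\alpha,\hat G_\beta] = i\epsilon_{\alpha\beta\gamma}\hat G_\gamma \tag{6.205}$$

In a similar way,

$$[\hat J_\alpha,\hat P_\beta] = i\epsilon_{\alpha\beta\gamma}\hat P_\gamma \tag{6.206}$$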

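Summarizing so far, we have

$$[\hat P_\alpha,\hat P_\beta] = 0,\quad [\hat P_\alpha,\hat H] = 0,\quad [\hat G_\alpha,\hat G_\beta] = 0,\quad [\hat J_\alpha,\hat H] = 0 \tag{6.207}$$

$$[\hat J_\alpha,\hat P_\alpha] = 0,\quad [\hat J_\alpha,\hat G_\alpha] = 0,\quad [\hat H,\hat H] = 0 \tag{6.208}$$

$$[\hat G_\alpha,\hat H] = i\hat P_\alpha,\quad [\hat J_\alpha,\hat J_\beta] = i\epsilon_{\alpha\beta\gamma}\hat J_\gamma \tag{6.209}$$

$$[\hat J_\alpha,\hat G_\beta] = i\epsilon_{\alpha\beta\gamma}\hat G_\gamma,\quad [\hat J_\alpha,\hat P_\beta] = i\epsilon_{\alpha\beta\gamma}\hat P_\gamma \tag{6.210}$$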


Before figuring out the last unknown commutator $[\hat G_\alpha,\hat P_\beta]$, we need to see whether the additional $\hat I$ term has an effect on any of the commutators we have already determined.


There are two relations, which are true for all commutators, that we will need 
to use. 


Commutators are antisymmetric so that 

[A,B] = -[B,A] (6.211) 

and they satisfy Jacobi’s identity which is 

[[A,B],C] = [[C,B],A] + [[A,C],B] (6.212) 
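Both relations hold for any square matrices, so they can be checked with exact integer arithmetic. The sketch below uses three arbitrary $2\times 2$ integer matrices (chosen only for illustration) and the form of Jacobi's identity written in (6.212).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), matmul(b, a), -1)

def antisymmetry_holds(a, b):
    """[A, B] = -[B, A]."""
    return comm(a, b) == [[-x for x in row] for row in comm(b, a)]

def jacobi_holds(a, b, c):
    """[[A,B],C] = [[C,B],A] + [[A,C],B], the form used in (6.212)."""
    lhs = comm(comm(a, b), c)
    rhs = add(comm(comm(c, b), a), comm(comm(a, c), b))
    return lhs == rhs

# Arbitrary integer matrices, so the comparison is exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
C = [[2, -1], [5, 3]]

print(antisymmetry_holds(A, B), jacobi_holds(A, B, C))  # True True
```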
These identities, as we shall see, limit the possible multiples of $\hat I$ that can be present. So we now assume that each commutator has an additional multiple of $\hat I$.

Thus, we write $[\hat J_1,\hat P_2] = i\hat P_3 - c\hat I$, where we have added a multiple of $\hat I$. We then have

$$\begin{aligned}
i[\hat P_3,\hat P_1] &= [[\hat J_1,\hat P_2],\hat P_1] + [c\hat I,\hat P_1] = [[\hat J_1,\hat P_2],\hat P_1]\\
&= [[\hat P_1,\hat P_2],\hat J_1] + [[\hat J_1,\hat P_1],\hat P_2]
\end{aligned}$$

Now, $[\hat P_1,\hat P_2] = 0$ and $[\hat J_1,\hat P_1] = 0$, therefore $[\hat P_3,\hat P_1] = 0$ and no additional multiple of $\hat I$ is needed in this commutator. Everything is consistent without it!

In a similar manner we can show that

$$[\hat P_\alpha,\hat P_\beta] = 0,\quad [\hat P_\alpha,\hat H] = 0,\quad [\hat G_\alpha,\hat G_\beta] = 0,\quad [\hat J_\alpha,\hat H] = 0 \tag{6.213}$$

so that no additional multiple of $\hat I$ is needed in any of these commutators.

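Since $[\hat J_\alpha,\hat J_\beta] = -[\hat J_\beta,\hat J_\alpha]$, if the commutator $[\hat J_\alpha,\hat J_\beta]$ is going to contain an extra multiple of $\hat I$, then the constant must be antisymmetric also. Therefore we must have

$$[\hat J_\alpha,\hat J_\beta] = i\epsilon_{\alpha\beta\gamma}\hat J_\gamma + i\epsilon_{\alpha\beta\gamma}b_\gamma\hat I \tag{6.214}$$

If we redefine $\hat J_\alpha \to \hat J_\alpha + b_\alpha\hat I$, then we get the original commutator back

$$[\hat J_\alpha,\hat J_\beta] = i\epsilon_{\alpha\beta\gamma}\hat J_\gamma \tag{6.215}$$

This change of definition implies that the transformation operator becomes

$$\hat U_\alpha(\theta) = e^{-i\theta\hat J_\alpha} \to e^{-i\theta\hat J_\alpha}e^{-i\theta b_\alpha} \tag{6.216}$$

and thus $|\psi'\rangle = \hat U|\psi\rangle$ changes to $e^{i\theta b_\alpha}|\psi'\rangle = \hat U|\psi\rangle$. Since overall phase factors for the state vector do not change any physics, we can ignore the extra $\hat I$ terms in this case. They do not change any real physics content!

In a similar manner, we show that

$$[\hat G_\alpha,\hat H] = i\hat P_\alpha,\quad [\hat J_\alpha,\hat G_\beta] = i\epsilon_{\alpha\beta\gamma}\hat G_\gamma,\quad [\hat J_\alpha,\hat P_\beta] = i\epsilon_{\alpha\beta\gamma}\hat P_\gamma \tag{6.217}$$

so that no additional multiple of $\hat I$ is needed in any of these commutators.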

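Finally, we are left to consider the unknown commutator $[\hat G_\alpha,\hat P_\beta]$. Now using the fact that $[c\hat I,\hat P_1] = 0$ we have

$$\begin{aligned}
i[\hat G_3,\hat P_1] &= [[\hat J_1,\hat G_2],\hat P_1] + [c\hat I,\hat P_1] = [[\hat J_1,\hat G_2],\hat P_1]\\
&= [[\hat P_1,\hat G_2],\hat J_1] + [[\hat J_1,\hat P_1],\hat G_2] = [[\hat P_1,\hat G_2],\hat J_1]
\end{aligned} \tag{6.218}$$

which has the solution

$$[\hat G_\alpha,\hat P_\beta] = 0,\quad \alpha\neq\beta \tag{6.219}$$

In addition, we have

$$\begin{aligned}
i[\hat G_3,\hat P_3] &= [[\hat J_1,\hat G_2],\hat P_3] + [c\hat I,\hat P_3] = [[\hat J_1,\hat G_2],\hat P_3]\\
&= [[\hat P_3,\hat G_2],\hat J_1] + [[\hat J_1,\hat P_3],\hat G_2]\\
&= -i[\hat P_2,\hat G_2]
\end{aligned} \tag{6.220}$$

or

$$[\hat G_3,\hat P_3] = [\hat G_2,\hat P_2] \tag{6.221}$$

and, in general

$$[\hat G_\alpha,\hat P_\alpha] = [\hat G_\beta,\hat P_\beta] \tag{6.222}$$

The only way to satisfy all of these commutators is to have the result

$$[\hat G_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}M\hat I \tag{6.223}$$

The value of $M$ is undetermined. It cannot be eliminated by including multiples of $\hat I$ in any of the other commutators. It must have some real physical significance and we will identify it shortly.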


6.12. Identification of Operators with Observables 

We now use the dynamics of a free particle, which is a physical system that is 
invariant under the Galilei group of space-time transformations, to identify the 
operators representing the dynamical variables or observables in that case. This 
section follows and expands on the work of Jordan(1975). 

We assume that there exists a position operator (as we discussed earlier)

$$\hat{\mathbf Q} = (\hat Q_1,\hat Q_2,\hat Q_3) \tag{6.224}$$

(boldface = multi-component or vector operator) where

$$\hat Q_\alpha|\vec x\rangle = x_\alpha|\vec x\rangle,\quad \alpha = 1,2,3 \tag{6.225}$$

The position operator has an unbounded ($-\infty < x_\alpha < \infty$), continuous spectrum. The three operators $\hat Q_\alpha$ have a common set of eigenvectors and, thus, they commute

$$[\hat Q_\alpha,\hat Q_\beta] = 0 \tag{6.226}$$

We now assume that there also exists a velocity operator

$$\hat{\mathbf V} = (\hat V_1,\hat V_2,\hat V_3) \tag{6.227}$$

such that we can make the following statement about expectation values (this is an assumption)

$$\frac{d}{dt}\langle\hat{\mathbf Q}\rangle = \langle\hat{\mathbf V}\rangle \tag{6.228}$$
for any state vector.

It is important to note that each treatment of quantum mechanics must make some assumption at this point. Although they might outwardly look like different assumptions, they clearly must be equivalent since they result in the same theory with the same predictions.

For a pure state $|\psi(t)\rangle$ we would then have

$$\begin{aligned}
\langle\psi(t)|\hat{\mathbf V}|\psi(t)\rangle &= \frac{d}{dt}\Big(\langle\psi(t)|\hat{\mathbf Q}|\psi(t)\rangle\Big)\\
&= \Big(\frac{d}{dt}\langle\psi(t)|\Big)\hat{\mathbf Q}|\psi(t)\rangle + \langle\psi(t)|\hat{\mathbf Q}\Big(\frac{d}{dt}|\psi(t)\rangle\Big)
\end{aligned} \tag{6.229}$$

Now a time displacement corresponds to $t \to t' = t + s$. To use this transformation, we first have to figure out a rule for the transformation of the ket vector argument. When we represent the abstract vector $|\psi\rangle$ as a function of space-time we must be very careful about defining its properties. We derived this result earlier, but it is so important that we do it again here.

Given $|\psi\rangle$, we have seen that

$$\psi(\vec x,t) = \langle\vec x,t|\psi\rangle \tag{6.230}$$

Operating on the abstract vector with the time translation operator, $e^{is\hat H}|\psi\rangle$, then gives

$$\langle\vec x,t|e^{is\hat H}|\psi\rangle \tag{6.231}$$

Now

$$e^{-is\hat H}|\vec x,t\rangle = |\vec x,t-s\rangle \tag{6.232}$$

and therefore we get

$$\langle\vec x,t|e^{is\hat H}|\psi\rangle = \langle\vec x,t-s|\psi\rangle = \psi(\vec x,t-s) \tag{6.233}$$

or

$$|\psi(t)\rangle \to e^{is\hat H}|\psi(t)\rangle = |\psi(t-s)\rangle \tag{6.234}$$

Here we have taken the so-called active point of view that the state is translated 
relative to a fixed coordinate system. 


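Now, we let $s = t$ to get

$$|\psi(t)\rangle \to e^{it\hat H}|\psi(t)\rangle = |\psi(0)\rangle \tag{6.235}$$

or

$$|\psi(t)\rangle = e^{-it\hat H}|\psi(0)\rangle \tag{6.236}$$

and

$$\frac{d}{dt}|\psi(t)\rangle = -i\hat H|\psi(t)\rangle \tag{6.237}$$

as we had already found in an earlier discussion. Using this result in (6.228) we then find

$$\begin{aligned}
\langle\psi(t)|\hat{\mathbf V}|\psi(t)\rangle &= i\langle\psi(t)|\hat H\hat{\mathbf Q}|\psi(t)\rangle - i\langle\psi(t)|\hat{\mathbf Q}\hat H|\psi(t)\rangle\\
&= i\langle\psi(t)|[\hat H,\hat{\mathbf Q}]|\psi(t)\rangle
\end{aligned} \tag{6.238}$$

which says that

$$\hat{\mathbf V} = i[\hat H,\hat{\mathbf Q}] \tag{6.239}$$

is a valid velocity operator for a free particle.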

However, since we do not know H, we are still stuck at the starting line. 

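Now a space displacement $\vec x \to \vec x' = \vec x + \vec a$ corresponds to

$$|\vec x\rangle \to |\vec x'\rangle = e^{-i\vec a\cdot\hat{\mathbf P}}|\vec x\rangle = |\vec x+\vec a\rangle \tag{6.240}$$

Again, in this active point of view, the state is displaced relative to a fixed coordinate system.

We then have, from our earlier discussion

$$\hat{\mathbf Q} \to \hat{\mathbf Q}' = e^{-i\sum_\alpha a_\alpha\hat P_\alpha}\,\hat{\mathbf Q}\,e^{i\sum_\alpha a_\alpha\hat P_\alpha} \tag{6.241}$$

where

$$\hat Q'_\alpha|\vec x\rangle' = x_\alpha|\vec x\rangle' \quad\text{or}\quad \hat Q'_\alpha|\vec x+\vec a\rangle = x_\alpha|\vec x+\vec a\rangle \tag{6.242}$$

Since

$$\hat Q_\alpha|\vec x\rangle = x_\alpha|\vec x\rangle \quad\text{or}\quad \hat Q_\alpha|\vec x+\vec a\rangle = (x_\alpha+a_\alpha)|\vec x+\vec a\rangle \tag{6.243}$$

we must have

$$(\hat Q_\alpha - a_\alpha\hat I)|\vec x+\vec a\rangle = x_\alpha|\vec x+\vec a\rangle \tag{6.244}$$

which says that

$$\hat Q'_\alpha = \hat Q_\alpha - a_\alpha\hat I \quad\text{or}\quad \hat{\mathbf Q}' = \hat{\mathbf Q} - \vec a\,\hat I \tag{6.245}$$

Now we need to work out how to use an operator of the form

$$e^{iu\hat A}\hat Be^{-iu\hat A} \tag{6.246}$$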


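There are two ways to do this. The first way recognizes that it is the solution of the 1st-order linear differential equation

$$\begin{aligned}
i\frac{d}{du}\,e^{iu\hat A}\hat Be^{-iu\hat A} &= -e^{iu\hat A}\hat A\hat Be^{-iu\hat A} + e^{iu\hat A}\hat B\hat Ae^{-iu\hat A}\\
&= e^{iu\hat A}[\hat B,\hat A]e^{-iu\hat A} = [e^{iu\hat A}\hat Be^{-iu\hat A},\hat A]
\end{aligned} \tag{6.247}$$

with boundary condition

$$e^{iu\hat A}\hat Be^{-iu\hat A} = \hat B \quad\text{at } u = 0 \tag{6.248}$$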
A second way is to expand the exponentials in power series and regroup the terms. In both cases, we get

$$e^{iu\hat A}\hat Be^{-iu\hat A} = \hat B - iu[\hat B,\hat A] - \frac{u^2}{2!}[[\hat B,\hat A],\hat A] + \cdots \tag{6.249}$$

Using this last relation we can find useful results like

$$\begin{aligned}
e^{iu\hat J_1}\hat J_3e^{-iu\hat J_1} &= \hat J_3 - iu[\hat J_3,\hat J_1] - \frac{u^2}{2!}[[\hat J_3,\hat J_1],\hat J_1] - \cdots\\
&= \hat J_3 - iu(i\hat J_2) - \frac{u^2}{2!}[i\hat J_2,\hat J_1] - \cdots\\
&= \hat J_3 + u\hat J_2 - \frac{u^2}{2!}\hat J_3 - \frac{u^3}{3!}\hat J_2 + \frac{u^4}{4!}\hat J_3 - \cdots\\
&= \hat J_3\cos u + \hat J_2\sin u
\end{aligned} \tag{6.250}$$
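The closed form in (6.250) can be compared against a direct numerical evaluation of the conjugation. This is only a sketch under an assumption: it uses the $2\times 2$ matrices $\hat J_\alpha = \sigma_\alpha/2$, which satisfy the commutators (6.200); the relation itself depends only on those commutators. The helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=30):
    """Matrix exponential from its Taylor series (adequate for small norms)."""
    n = len(a)
    result = term = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        term = scale(1 / k, matmul(term, a))
        result = add(result, term)
    return result

# Assumed representation: J_alpha = sigma_alpha / 2.
J1 = [[0, 0.5], [0.5, 0]]
J2 = [[0, -0.5j], [0.5j, 0]]
J3 = [[0.5, 0], [0, -0.5]]

def conjugated(u):
    """e^{iu J1} J3 e^{-iu J1}, evaluated directly."""
    return matmul(matmul(expm(scale(1j * u, J1)), J3), expm(scale(-1j * u, J1)))

def closed_form(u):
    """J3 cos(u) + J2 sin(u)."""
    return add(scale(math.cos(u), J3), scale(math.sin(u), J2))

def gap(u):
    return max(abs(x - y) for ra, rb in zip(conjugated(u), closed_form(u))
               for x, y in zip(ra, rb))

print(gap(0.7) < 1e-12)  # True
```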


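Now we finally arrive at the result we are looking for

$$\begin{aligned}
\hat Q'_\alpha = \hat Q_\alpha - a_\alpha\hat I &= e^{-i\sum_\beta a_\beta\hat P_\beta}\,\hat Q_\alpha\,e^{i\sum_\beta a_\beta\hat P_\beta}\\
&= \hat Q_\alpha + i\sum_\beta a_\beta[\hat Q_\alpha,\hat P_\beta] + \cdots
\end{aligned} \tag{6.251}$$

which implies that

$$[\hat Q_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}\hat I \tag{6.252}$$

This solution not only satisfies the equation to 1st order as above, but also implies that all higher-order terms are explicitly equal to zero. Thus, the solution is exact.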


This equation is one of the most important results in the theory of 
quantum mechanics. 
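One way to see this commutator at work is in a concrete realization. The sketch below assumes a one-dimensional representation that is not derived in this section: $\hat Q$ multiplies a function by $x$ and $\hat P = -i\,d/dx$ (with the scale constant set to 1), acting on polynomials stored as coefficient lists. All names are hypothetical.

```python
def Q(f):
    """Multiply the polynomial f(x) by x (coefficients listed lowest order first)."""
    return [0] + list(f)

def P(f):
    """Assumed representation of the momentum operator: -i d/dx."""
    return [-1j * k * f[k] for k in range(1, len(f))] or [0]

def sub(f, g):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a - b for a, b in zip(f, g)]

def commutator_QP(f):
    """([Q, P] f)(x) = Q(P f) - P(Q f)."""
    return sub(Q(P(f)), P(Q(f)))

f = [1, 2, 0, -3]                    # f(x) = 1 + 2x - 3x^3
lhs = commutator_QP(f)
rhs = [1j * c for c in f]            # i * f
print(all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs)))  # True
```

In this representation the relation holds for every polynomial, which reflects the statement above that the solution is exact and not merely a first-order result.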

Now we continue the derivation of that last elusive commutator. 


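A rotation through an infinitesimal angle $\theta$ about an axis along the unit vector $\vec n$ has the effect

$$\vec x \to \vec x' = \vec x + \theta\,\vec n\times\vec x \tag{6.253}$$

The corresponding transformation of the position eigenvectors is

$$|\vec x\rangle \to |\vec x'\rangle = e^{-i\theta\vec n\cdot\hat{\mathbf J}}|\vec x\rangle \tag{6.254}$$

and for the position operator

$$\begin{aligned}
\hat Q_\alpha \to \hat Q'_\alpha &= e^{-i\theta\vec n\cdot\hat{\mathbf J}}\,\hat Q_\alpha\,e^{i\theta\vec n\cdot\hat{\mathbf J}}\\
&= \hat Q_\alpha - i\theta[\vec n\cdot\hat{\mathbf J},\hat Q_\alpha] + O(\theta^2)
\end{aligned} \tag{6.255}$$

We note that $|\vec x\rangle'$ = an eigenvector of $\hat{\mathbf Q}'$ and $|\vec x'\rangle$ = an eigenvector of $\hat{\mathbf Q}$ (different eigenvalues), but they are the same vector! (Think about rotations of the axes.)

Now as before, (6.241) and (6.242),

$$\hat Q'_\alpha|\vec x\rangle' = x_\alpha|\vec x\rangle' \tag{6.256}$$

$$\begin{aligned}
\hat Q_\alpha|\vec x'\rangle &= x'_\alpha|\vec x'\rangle = (\vec x + \theta\,\vec n\times\vec x)_\alpha|\vec x'\rangle\\
&= (\hat{\mathbf Q}' + \theta\,\vec n\times\hat{\mathbf Q}')_\alpha|\vec x\rangle'
\end{aligned} \tag{6.257}$$

Now since the vectors $|\vec x'\rangle = |\vec x\rangle'$ are a complete set, we must have

$$\hat{\mathbf Q} = \hat{\mathbf Q}' + \theta\,\vec n\times\hat{\mathbf Q}' \tag{6.258}$$

or inverting this expression to 1st order we have

$$\hat{\mathbf Q}' = \hat{\mathbf Q} - \theta\,\vec n\times\hat{\mathbf Q} \tag{6.259}$$

Therefore, we find

$$[\vec n\cdot\hat{\mathbf J},\hat{\mathbf Q}] = -i\,\vec n\times\hat{\mathbf Q} \tag{6.260}$$

For an arbitrary unit vector $\vec u$ this says that

$$[\vec n\cdot\hat{\mathbf J},\vec u\cdot\hat{\mathbf Q}] = -i\,\vec u\cdot(\vec n\times\hat{\mathbf Q}) = i(\vec n\times\vec u)\cdot\hat{\mathbf Q} \tag{6.261}$$

or

$$[\hat J_\alpha,\hat Q_\beta] = i(\vec e_\alpha\times\vec e_\beta)\cdot\hat{\mathbf Q} = i(\epsilon_{\alpha\beta\gamma}\vec e_\gamma)\cdot\hat{\mathbf Q} = i\epsilon_{\alpha\beta\gamma}\hat Q_\gamma \tag{6.262}$$

We note that this result is not only true for the components of the position operator, but it is true for the components of any vector operator $\hat{\mathbf A}$:

$$[\hat J_\alpha,\hat A_\beta] = i\epsilon_{\alpha\beta\gamma}\hat A_\gamma \tag{6.263}$$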

In a similar manner, $\hat{\mathbf G}$ generates a displacement in velocity space and we have

$$\hat{\mathbf V}' = \hat{\mathbf V} - \vec v\,\hat I = e^{i\vec v\cdot\hat{\mathbf G}}\,\hat{\mathbf V}\,e^{-i\vec v\cdot\hat{\mathbf G}} \tag{6.264}$$

This is the same way the $\hat{\mathbf Q}$ and $\hat{\mathbf P}$ operators behaved earlier. Now the unitary operator $\hat U(\vec v) = e^{i\vec v\cdot\hat{\mathbf G}}$ describes the instantaneous ($t = 0$) effect of a transformation to a frame of reference moving at velocity $\vec v$ with respect to the original frame. It affects the $\hat{\mathbf V}$ operator as in (6.264). Due to its instantaneous nature we must also have

$$\hat U\hat{\mathbf Q}\hat U^{-1} = \hat{\mathbf Q} \quad\text{or}\quad [\hat G_\alpha,\hat Q_\beta] = 0 \tag{6.265}$$

Now, $\hat{\mathbf Q}$, the position operator, is clearly identified with an observable of the physical state. After much work we have determined the commutators of $\hat{\mathbf Q}$ with all the symmetry generators of the Galilei group. We now have

$$[\hat G_\alpha,\hat Q_\beta] = 0,\quad [\hat G_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}M\hat I,\quad [\hat Q_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}\hat I \tag{6.266}$$

A possible solution of these equations is

$$\hat G_\alpha = M\hat Q_\alpha \tag{6.267}$$

We cannot show this is a unique solution.

We note, however, at this point that it is certainly true (using the commutators in (6.266)) that

$$[\hat{\mathbf G} - M\hat{\mathbf Q},\hat{\mathbf P}] = 0 \quad\text{and}\quad [\hat{\mathbf G} - M\hat{\mathbf Q},\hat{\mathbf Q}] = 0 \tag{6.268}$$

We must now turn to special cases in order to learn more about the physical 
meaning of the generators. 

Before discussing the special cases, we must again digress to review some math¬ 
ematics discussed earlier and also cover some new mathematical ideas that we 
will need. 

A subspace $M$ reduces a linear operator $\hat A$ if $\hat A|\psi\rangle$ is in $M$ for every $|\psi\rangle$ in $M$ and $\hat A|\phi\rangle$ is in $M^\perp$ for every $|\phi\rangle$ in $M^\perp$.

A set of operators is reducible if there is a subspace, other than the whole space or the subspace containing only the null vector, which reduces every operator in the set. Otherwise, we say that the set is irreducible.

A subspace $M$ is invariant under a set of operators if $\hat A|\psi\rangle$ is in $M$ for every operator in the set and every vector in $M$.

Thus, a subspace $M$ reduces a set of operators if and only if $M$ and $M^\perp$ are invariant under the set of operators.

A set of operators is symmetric if $\hat A^\dagger$ is in the set for every operator $\hat A$ in the set.

If a subspace is invariant under a symmetric set of operators, then it reduces the set of operators.

Schur's Lemma

A symmetric set of bounded or Hermitian operators is irreducible if and only if multiples of $\hat I$ are the only bounded operators which commute with all operators in the set.

Example 0: The commutator $[\hat Q_\alpha,\hat Q_\beta] = 0$ says that the set of operators $\{\hat Q_\alpha, \alpha = 1,2,3\}$ is a complete set of commuting operators. Since $[\hat Q_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}\hat I$, any function of $\hat Q_\alpha$ that commutes with the $\hat P_\alpha$ must be a multiple of $\hat I$. Therefore, the set $\{\hat Q_1,\hat Q_2,\hat Q_3,\hat P_1,\hat P_2,\hat P_3\}$ is irreducible.

In other words, if an operator commutes with the $\hat Q_\alpha$, then it is not a function of $\hat P_\alpha$ since $[\hat Q_\alpha,\hat P_\beta] \neq 0$. If an operator commutes with the $\hat P_\alpha$, then it is not a function of $\hat Q_\alpha$, for the same reason. If an operator is independent of the set $\{\hat Q_\alpha,\hat P_\beta\}$ and there are no internal (not dependent on $\{\hat Q_\alpha,\hat P_\beta\}$) degrees of freedom, then it must be a multiple of $\hat I$.


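Example 1: Free Particle - no internal degrees of freedom

Now, as we stated earlier in (6.268),

$$[\hat{\mathbf G} - M\hat{\mathbf Q},\hat{\mathbf P}] = 0 \quad\text{and}\quad [\hat{\mathbf G} - M\hat{\mathbf Q},\hat{\mathbf Q}] = 0 \tag{6.269}$$

and therefore

$$\hat{\mathbf G} - M\hat{\mathbf Q} = \text{multiple of } \hat I \tag{6.270}$$

or

$$\hat G_\alpha = M\hat Q_\alpha + c_\alpha\hat I \tag{6.271}$$

But $\hat G_\alpha$ is a component of a vector operator and therefore it must satisfy

$$[\hat J_\alpha,\hat G_\beta] = i\epsilon_{\alpha\beta\gamma}\hat G_\gamma \tag{6.272}$$

The first term $M\hat Q_\alpha$ satisfies this relation, but the $c_\alpha\hat I$ term cannot unless we choose $c_\alpha = 0$.

Therefore, we do not have any extra multiples of $\hat I$ and we find that

$$\hat G_\alpha = M\hat Q_\alpha \tag{6.273}$$

when there are no internal degrees of freedom.

In a similar manner, we can show that

$$[\hat{\mathbf J} - \hat{\mathbf Q}\times\hat{\mathbf P},\hat{\mathbf P}] = 0 \quad\text{and}\quad [\hat{\mathbf J} - \hat{\mathbf Q}\times\hat{\mathbf P},\hat{\mathbf Q}] = 0 \tag{6.274}$$

which then implies that

$$\hat{\mathbf J} - \hat{\mathbf Q}\times\hat{\mathbf P} = \vec c\,\hat I \tag{6.275}$$

But since

$$[\hat J_\alpha,\hat J_\beta] = i\epsilon_{\alpha\beta\gamma}\hat J_\gamma \tag{6.276}$$

again we are forced to choose $\vec c = 0$ and thus

$$\hat{\mathbf J} = \hat{\mathbf Q}\times\hat{\mathbf P} \tag{6.277}$$

when there are no internal degrees of freedom.

The remaining generator we need to identify is $\hat H$. It must satisfy

$$[\hat G_\alpha,\hat H] = i\hat P_\alpha \;\to\; [M\hat Q_\alpha,\hat H] = i\hat P_\alpha \;\to\; [\hat Q_\alpha,\hat H] = i\frac{\hat P_\alpha}{M} \tag{6.278}$$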
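A solution of this equation is given by

$$\hat H = \frac{\hat{\mathbf P}\cdot\hat{\mathbf P}}{2M} = \frac{\hat{\mathbf P}^2}{2M} \tag{6.279}$$

as can be seen below

$$\begin{aligned}
\Big[\hat Q_\alpha,\frac{\hat{\mathbf P}\cdot\hat{\mathbf P}}{2M}\Big] &= \frac{1}{2M}\sum_\beta[\hat Q_\alpha,\hat P_\beta\hat P_\beta] = \frac{1}{2M}\sum_\beta(\hat Q_\alpha\hat P_\beta\hat P_\beta - \hat P_\beta\hat P_\beta\hat Q_\alpha)\\
&= \frac{1}{2M}\sum_\beta\big(\hat Q_\alpha\hat P_\beta\hat P_\beta - \hat P_\beta(\hat Q_\alpha\hat P_\beta - i\delta_{\alpha\beta}\hat I)\big)\\
&= \frac{1}{2M}\sum_\beta(\hat Q_\alpha\hat P_\beta\hat P_\beta - \hat P_\beta\hat Q_\alpha\hat P_\beta + i\delta_{\alpha\beta}\hat P_\beta)\\
&= \frac{1}{2M}\sum_\beta\big(\hat Q_\alpha\hat P_\beta\hat P_\beta - (\hat Q_\alpha\hat P_\beta - i\delta_{\alpha\beta}\hat I)\hat P_\beta + i\delta_{\alpha\beta}\hat P_\beta\big)\\
&= \frac{1}{2M}\sum_\beta 2i\delta_{\alpha\beta}\hat P_\beta = i\frac{\hat P_\alpha}{M}
\end{aligned} \tag{6.280}$$

This result implies that $\hat H - \hat{\mathbf P}\cdot\hat{\mathbf P}/2M$ commutes with $\hat{\mathbf Q}$ and, since $[\hat{\mathbf P},\hat{\mathbf P}] = 0$, it also commutes with $\hat{\mathbf P}$. Therefore, it is a multiple of $\hat I$, and we find

$$\hat H - \frac{\hat{\mathbf P}\cdot\hat{\mathbf P}}{2M} = E_0\hat I,\quad E_0 = \text{constant} \tag{6.281}$$

or

$$\hat H = \frac{\hat{\mathbf P}\cdot\hat{\mathbf P}}{2M} + E_0\hat I \tag{6.282}$$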

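Now, earlier we found that

$$\hat{\mathbf V} = i[\hat H,\hat{\mathbf Q}] \tag{6.283}$$

which now implies that

$$[\hat Q_\alpha,\hat H] = \Big[\hat Q_\alpha,\frac{\hat{\mathbf P}\cdot\hat{\mathbf P}}{2M}\Big] = i\frac{\hat P_\alpha}{M} \tag{6.284}$$

Thus,

$$\hat V_\alpha = \frac{\hat P_\alpha}{M} \quad\text{or}\quad \hat{\mathbf V} = \frac{\hat{\mathbf P}}{M} \tag{6.285}$$

Summarizing, we have found that

$$\hat{\mathbf P} = M\hat{\mathbf V},\quad \hat H = \frac{1}{2}M\hat{\mathbf V}\cdot\hat{\mathbf V} + E_0,\quad \hat{\mathbf J} = \hat{\mathbf Q}\times M\hat{\mathbf V} \tag{6.286}$$

where

$$\hat{\mathbf V} = \text{the velocity operator and } \hat{\mathbf Q} = \text{the position operator} \tag{6.287}$$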
This implies, since we are talking about a free particle, that $M$ must be proportional to the mass of the free particle, i.e., suppose that

$$M = \text{constant} \times \text{mass} = \beta m \tag{6.288}$$

then we must have

$$\begin{aligned}
\hat P_\alpha &= \text{constant} \times \text{linear momentum} = \beta\hat p_\alpha\\
\hat H &= \text{constant} \times \text{energy} = \beta\hat E\\
\hat J_\alpha &= \text{constant} \times \text{angular momentum} = \beta\hat j_\alpha
\end{aligned} \tag{6.289}$$

and all the relations (6.286)

$$\hat{\mathbf P} = M\hat{\mathbf V},\quad \hat H = \frac{1}{2}M\hat{\mathbf V}\cdot\hat{\mathbf V} + E_0,\quad \hat{\mathbf J} = \hat{\mathbf Q}\times M\hat{\mathbf V} \tag{6.290}$$

still hold.

Now also remember that

$$[\hat Q_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}\hat I \tag{6.291}$$

which then implies that

$$[\hat Q_\alpha,\hat p_\beta] = i\frac{1}{\beta}\delta_{\alpha\beta}\hat I \tag{6.292}$$

This says that $1/\beta$ must have the units Joule-sec. With hindsight as to later developments, we will now choose

$$\hbar = \frac{1}{\beta} \tag{6.293}$$

and thus we finally obtain

$$[\hat Q_\alpha,\hat p_\beta] = i\hbar\delta_{\alpha\beta}\hat I \tag{6.294}$$

At this point, we do not know the numerical value of $\hbar$ (any value will do at this stage of our development). It can only be determined by experiment. Later we shall see that it is

$$\hbar = \frac{h}{2\pi} \tag{6.295}$$

where $h$ = Planck's constant = $6.62 \times 10^{-34}$ Joule-sec. $\hbar$ must be determined
factors in physical theories cannot be known a priori (even though Kant thought 
just the opposite was true). 
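For orientation, the arithmetic in (6.295) can be carried out with the value of $h$ quoted above. The variable names below are hypothetical, and $\beta = 1/\hbar$ is the proportionality constant introduced in (6.288).

```python
import math

PLANCK_H = 6.62e-34                 # Joule-sec, the value quoted in the text
HBAR = PLANCK_H / (2 * math.pi)     # equation (6.295)
BETA = 1 / HBAR                     # the constant in M = beta * m, from (6.293)

print(f"hbar = {HBAR:.3e} Joule-sec")   # hbar = 1.054e-34 Joule-sec
```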

Example 2 : Free Particle - with Spin 

Internal degrees of freedom are, by definition, independent of the center of mass degrees of freedom (Example 1). This means they are represented by operators that are independent of both $\hat{\mathbf Q}$ and $\hat{\mathbf P}$, or that they are represented by operators that commute with both $\hat{\mathbf Q}$ and $\hat{\mathbf P}$.

The set of operators {Q, P} is not irreducible in this case since an operator that 
commutes with the set may still be a function of the operators corresponding 
to the internal degrees of freedom. 

The spin S is defined to be an internal contribution to the total angular mo¬ 
mentum of the system. Therefore, we must modify the operator representing 
angular momentum to be 

J = Q x P + S (6.296) 

with [Q, S] = 0 = [P, S]. 


We will study spin in great detail later in this book. 

For now, let us see what we can say just based on the fact that spin is an angular 
momentum. This means that the S a must have the same commutators among 
themselves as the J a . 

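$$[\hat S_\alpha,\hat S_\beta] = i\epsilon_{\alpha\beta\gamma}\hat S_\gamma \tag{6.297}$$

Earlier we found that the equation

$$[\hat G_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}M\hat I \tag{6.298}$$

is satisfied by $\hat{\mathbf G} = M\hat{\mathbf Q}$, and that

$$[\hat{\mathbf G} - M\hat{\mathbf Q},\hat{\mathbf P}] = 0 \quad\text{and}\quad [\hat{\mathbf G} - M\hat{\mathbf Q},\hat{\mathbf Q}] = 0 \tag{6.299}$$

implied that $\hat{\mathbf G} - M\hat{\mathbf Q} = a\hat I$; we decided that we must have $a = 0$.

Now, however, we could also have terms involving $\hat{\mathbf S}$ of the form

$$\hat{\mathbf G} - M\hat{\mathbf Q} = c\hat{\mathbf S} \tag{6.300}$$

Higher-order powers of $\hat{\mathbf S}$ do not contribute any new terms because the commutator, which we can express as

$$\hat{\mathbf S}\times\hat{\mathbf S} = i\hat{\mathbf S} \tag{6.301}$$

indicates that they will reduce to a first-order term. So this result is general.

Now we found earlier that $[\hat G_\alpha,\hat G_\beta] = 0$, which implies that $c = 0$ and thus we still have $\hat{\mathbf G} = M\hat{\mathbf Q}$.

The previous argument to get

$$\hat H = \frac{\hat{\mathbf P}\cdot\hat{\mathbf P}}{2M} + E_0 \tag{6.302}$$

is identical to before except now $E_0$ can be a function of $\hat{\mathbf S}$. Now we must have

$$[\hat{\mathbf J},\hat H] = 0 \;\to\; [\hat{\mathbf S},E_0] = 0 \tag{6.303}$$

which implies that $E_0 = c\,\hat{\mathbf S}\cdot\hat{\mathbf S}$ is the only possibility. This has no effect on the $\hat{\mathbf V}$ operator given by

$$\hat{\mathbf V} = i[\hat H,\hat{\mathbf Q}] \tag{6.304}$$

since $[E_0,\hat{\mathbf Q}] = 0$. Therefore, the relation

$$\hat{\mathbf V} = \frac{\hat{\mathbf P}}{M} \tag{6.305}$$

remains valid.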

So everything is the same as in Example 1 except that E 0 now corresponds to 
an internal (spin dependent) contribution to the energy. 

Example 3 - A Particle Interacting with External Fields 

We will only consider a spinless particle. 

Interactions change the time evolution operator and as a consequence, the prob¬ 
ability distributions for observables. 

We assume that the equation of motion for the state vector retains its form

$$\frac{d}{dt}|\psi(t)\rangle = -i\hat H|\psi(t)\rangle \tag{6.306}$$

but the generator $\hat H$ changes to include the interactions. We single out $\hat H$ as the only generator that changes in the presence of interactions because it generates dynamical evolution in time and interactions only change that property. All the other generators imply purely geometric transformations which are not dynamical.

We also retain the definition $\hat{\mathbf V} = i[\hat H,\hat{\mathbf Q}]$, but note that interactions (a change in $\hat H$) will affect its value. So we assume that $\hat H$ still satisfies

$$\hat{\mathbf V} = i[\hat H,\hat{\mathbf Q}] \tag{6.307}$$

If we shift to a different frame of reference moving uniformly with respect to the original frame, we saw earlier that $\hat{\mathbf V}$ transforms as

$$e^{i\vec v\cdot\hat{\mathbf G}}\,\hat{\mathbf V}\,e^{-i\vec v\cdot\hat{\mathbf G}} = \hat{\mathbf V} - \vec v\,\hat I \tag{6.308}$$

Expanding the left-hand side to 1st order in $\vec v$ gives

$$i[\vec v\cdot\hat{\mathbf G},\hat{\mathbf V}] = -\vec v\,\hat I \quad\text{and}\quad [\hat G_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}M\hat I \tag{6.309}$$
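Now, we still have the relation $\hat G_\alpha = M\hat Q_\alpha$, since its derivation does not use any commutators involving $\hat H$.

Our earlier solution for $\hat{\mathbf V}$ (no external fields) was

$$\hat{\mathbf V} = \frac{\hat{\mathbf P}}{M} \tag{6.310}$$

Now the commutators

$$[\hat G_\alpha,\hat V_\beta] = i\delta_{\alpha\beta}\hat I \quad\text{and}\quad [\hat G_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}M\hat I \tag{6.311}$$

imply that $\hat{\mathbf V} - \hat{\mathbf P}/M$ commutes with $\hat{\mathbf G}$. Since $\hat{\mathbf G} = M\hat{\mathbf Q}$, we must have

$$\Big[\hat{\mathbf V} - \frac{\hat{\mathbf P}}{M},\hat{\mathbf Q}\Big] = 0 \tag{6.312}$$

With no internal degrees of freedom present, the set of operators $\{\hat Q_\alpha\}$ is a complete commuting set. This means that we must have

$$\hat{\mathbf V} - \frac{\hat{\mathbf P}}{M} = -\frac{\hat{\mathbf A}(\hat{\mathbf Q})}{M} = \text{a function only of } \hat{\mathbf Q} \tag{6.313}$$

or

$$\hat{\mathbf V} = \frac{\hat{\mathbf P} - \hat{\mathbf A}(\hat{\mathbf Q})}{M} \tag{6.314}$$

We now need to solve $\hat{\mathbf V} = i[\hat H,\hat{\mathbf Q}]$ for $\hat H$. We have the result

$$i[\hat H,\hat{\mathbf Q}] = \frac{\hat{\mathbf P} - \hat{\mathbf A}(\hat{\mathbf Q})}{M} \tag{6.315}$$

A possible solution is

$$\hat H_0 = \frac{(\hat{\mathbf P} - \hat{\mathbf A}(\hat{\mathbf Q}))^2}{2M} \tag{6.316}$$

as can be seen from the derivation below. We have

$$\begin{aligned}
[\hat H_0,\hat Q_\alpha] &= \frac{1}{2M}\sum_\beta\big[(\hat P_\beta^2 - \hat A_\beta\hat P_\beta - \hat P_\beta\hat A_\beta + \hat A_\beta^2),\hat Q_\alpha\big]\\
&= \frac{1}{2M}\sum_\beta\big([\hat P_\beta^2,\hat Q_\alpha] - [\hat A_\beta\hat P_\beta,\hat Q_\alpha] - [\hat P_\beta\hat A_\beta,\hat Q_\alpha] + [\hat A_\beta^2,\hat Q_\alpha]\big)
\end{aligned} \tag{6.317}$$

Now

$$[\hat Q_\alpha,\hat P_\beta] = i\delta_{\alpha\beta}\hat I \quad\text{and}\quad [\hat A_\beta,\hat Q_\alpha] = 0 \tag{6.318}$$

which gives

$$\begin{aligned}
[\hat A_\beta\hat P_\beta,\hat Q_\alpha] &= \hat A_\beta[\hat P_\beta,\hat Q_\alpha] = -i\hat A_\beta\delta_{\alpha\beta}\hat I\\
[\hat P_\beta\hat A_\beta,\hat Q_\alpha] &= [\hat P_\beta,\hat Q_\alpha]\hat A_\beta = -i\hat A_\beta\delta_{\alpha\beta}\hat I\\
[\hat P_\beta^2,\hat Q_\alpha] &= \hat P_\beta\hat P_\beta\hat Q_\alpha - \hat Q_\alpha\hat P_\beta\hat P_\beta = \hat P_\beta[\hat P_\beta,\hat Q_\alpha] - i\delta_{\alpha\beta}\hat P_\beta = -2i\delta_{\alpha\beta}\hat P_\beta\\
[\hat A_\beta^2,\hat Q_\alpha] &= 0
\end{aligned}$$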
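We then have the final step in the proof

$$[\hat H_0,\hat Q_\alpha] = -\frac{1}{2M}\sum_\beta 2i\delta_{\alpha\beta}(\hat P_\beta - \hat A_\beta) = -i\frac{\hat P_\alpha - \hat A_\alpha}{M} = -i\hat V_\alpha \tag{6.319}$$

That completes the proof. Finally, we can then say

$$[(\hat H - \hat H_0),\hat Q_\alpha] = 0 \tag{6.320}$$

which implies that at most $\hat H$ can differ from $\hat H_0$ only by a function of $\hat{\mathbf Q}$

$$\hat H - \hat H_0 = W(\hat{\mathbf Q}) = \text{function of } \hat{\mathbf Q} \tag{6.321}$$

or, in general

$$\hat H = \frac{(\hat{\mathbf P} - \hat{\mathbf A}(\hat{\mathbf Q}))^2}{2M} + W(\hat{\mathbf Q}) \tag{6.322}$$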

This is the only form of H consistent with invariance under the Galilei group of 
transformations. 
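The key step, that $i[\hat H_0,\hat Q]$ reproduces $(\hat P - \hat A(\hat Q))/M$, can be spot-checked in a simple setting. The sketch below makes several assumptions that go beyond the text: it works in one dimension, represents $\hat Q$ as multiplication by $x$ and $\hat P$ as $-i\,d/dx$ on polynomials (coefficient lists, lowest order first), and takes the potential $A(x)$ to be a polynomial. All function names are hypothetical.

```python
def poly_add(f, g, sign=1):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a + sign * b for a, b in zip(f, g)]

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def Q(f):
    """Position operator: multiply by x."""
    return [0] + list(f)

def P(f):
    """Assumed representation of the momentum operator: -i d/dx."""
    return [-1j * k * f[k] for k in range(1, len(f))] or [0]

def kinetic_momentum(f, A):
    """(P - A(Q)) f."""
    return poly_add(P(f), poly_mul(A, f), -1)

def H0(f, A, M):
    """(P - A(Q))^2 f / (2M)."""
    return [c / (2 * M) for c in kinetic_momentum(kinetic_momentum(f, A), A)]

def velocity_from_commutator(f, A, M):
    """i [H0, Q] f."""
    return [1j * c for c in poly_add(H0(Q(f), A, M), Q(H0(f, A, M)), -1)]

def velocity_direct(f, A, M):
    """(P - A(Q)) f / M."""
    return [c / M for c in kinetic_momentum(f, A)]

def max_gap(f, g):
    return max(abs(c) for c in poly_add(f, g, -1))

A = [0.5, 1.0]      # A(x) = 0.5 + x, an arbitrary choice
f = [1, -2, 3]      # f(x) = 1 - 2x + 3x^2, an arbitrary test function
M = 2.0
print(max_gap(velocity_from_commutator(f, A, M), velocity_direct(f, A, M)) < 1e-12)  # True
```

Adding any polynomial $W(x)$ to `H0` would leave the check unchanged, since multiplication by $W(x)$ commutes with multiplication by $x$; this mirrors the freedom expressed in (6.321).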

The two new functions of Q are called 

A(Q) = vector potential and W(Q) = scalar potential (6.323) 

Both of the functions can be time-dependent. As operators they are functions 
only of Q and not of P. 

This form certainly includes the classical electromagnetic interaction. However, we cannot identify $\hat{\mathbf A}(\hat{\mathbf Q})$ and $W(\hat{\mathbf Q})$ with the electromagnetic potential because nothing in the derivation implies that they need to satisfy Maxwell's equations.

This method can be generalized to cover the case of more than one particle 
interacting with external fields. 

An Aside 


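The conventional notation involving the generators and the constant $\hbar$ is to use

$$\hat{\mathbf J} \to \frac{\hat{\mathbf J}}{\hbar},\quad \hat{\mathbf P} \to \frac{\hat{\mathbf P}}{\hbar},\quad \hat H \to \frac{\hat H}{\hbar} \tag{6.324}$$

leading to changed transformation operators of the form

$$e^{-i\vec a\cdot\hat{\mathbf P}/\hbar},\quad e^{-i\theta\vec n\cdot\hat{\mathbf J}/\hbar},\quad e^{-it\hat H/\hbar} \tag{6.325}$$

or we could just let $\hbar = 1$ and continue to use all previous results unchanged.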
6.13. Multiparticle Systems

How do we generalize these single particle results to systems with more than 
one particle? 

We will deal with a special case that illustrates the general procedure without 
adding any extra complexity. 

Consider two particles forming a composite system where

$\hat Q^{(1)}$ = operator representing an observable of particle 1
$\hat T^{(2)}$ = operator representing an observable of particle 2

We assume that the two particles in the composite system can be separated 
physically so that they would not have any interaction with each other. This 
means that when they are separated, the composite system must reduce to two 
independent one-particle systems which we can describe using all of the one- 
particle results we have already derived. 

Thus, when we find a description of the two-particle system, it must include 
the separate one-particle descriptions in some way. This means that it must be 
possible to prepare the one-particle states as separate, independent entities in 
the laboratory. 

Now, earlier we proved that there exists a state of the system where an observable represented by an operator $\hat Q^{(1)}$ has a definite value (probability = 1). This state is any one of the eigenvectors of $\hat Q^{(1)}$ and the definite value is the corresponding eigenvalue. A similar result holds for the operator $\hat T^{(2)}$.

Now we have assumed that the properties of the two particles can be measured independently. This means that a two-particle state vector for the composite system must exist such that it is a common eigenvector for all operators representing observables of both particles. This says that if

$$\hat Q^{(1)}|q_m\rangle_1 = q_m|q_m\rangle_1 \quad\text{and}\quad \hat T^{(2)}|t_n\rangle_2 = t_n|t_n\rangle_2 \tag{6.326}$$

then for every $m$ and $n$ there exists a two-particle state vector $|q_m,t_n\rangle$ for the composite system with the properties

$$\hat Q^{(1)}|q_m,t_n\rangle = q_m|q_m,t_n\rangle \quad\text{and}\quad \hat T^{(2)}|q_m,t_n\rangle = t_n|q_m,t_n\rangle \tag{6.327}$$

The way to satisfy these conditions is to represent the two-particle state vector as a mathematical object known as the Kronecker or direct product. We write this (symbolically) as

$$|q_m,t_n\rangle = |q_m\rangle_1|t_n\rangle_2 \quad\text{or}\quad |q_m,t_n\rangle = |q_m\rangle_1\otimes|t_n\rangle_2 \tag{6.328}$$
If the set of vectors $\{|q_m\rangle_1\}$ spans an $M$-dimensional vector space and the set of vectors $\{|t_n\rangle_2\}$ spans an $N$-dimensional vector space, then the set of all direct product vectors spans an $M \times N$ dimensional vector space.

Suppose we have two sets of operators, $\{\hat Q_i^{(1)}\}$ for the system consisting of particle 1 and $\{\hat T_j^{(2)}\}$ for the system consisting of particle 2. We define how these operators act on the direct product states by the rules

$$\begin{aligned}
\hat Q_i^{(1)}|q_m,t_n\rangle &= \big(\hat Q_i^{(1)}|q_m\rangle_1\big)\otimes|t_n\rangle_2\\
\hat T_j^{(2)}|q_m,t_n\rangle &= |q_m\rangle_1\otimes\big(\hat T_j^{(2)}|t_n\rangle_2\big)
\end{aligned} \tag{6.329}$$

and we define a direct product between operators in the two sets by the relation

$$\big(\hat Q_i^{(1)}\otimes\hat T_j^{(2)}\big)|q_m,t_n\rangle = \big(\hat Q_i^{(1)}|q_m\rangle_1\big)\otimes\big(\hat T_j^{(2)}|t_n\rangle_2\big) \tag{6.330}$$

When we write $\hat Q_i^{(1)}$ or $\hat T_j^{(2)}$ alone, we really mean the following

$$\begin{aligned}
\hat Q_i^{(1)} &\to \hat Q_i^{(1)}\otimes\hat I^{(2)}\\
\hat T_j^{(2)} &\to \hat I^{(1)}\otimes\hat T_j^{(2)}
\end{aligned} \tag{6.331}$$

This definition of the direct product operators does not include all relevant physical operators that we use in quantum mechanics. When the particles are interacting, there must exist interaction operators that act on both sets of states. Although this means that an individual direct product state vector for a composite system cannot directly represent interacting particles, we will be able to use the set of all such states (which is complete) as a basis for representing states of interacting particles.

Since the common set of states $\{|q_m,t_n\rangle\}$ comprise a complete basis set for all of the $\hat Q_i^{(1)}$ and $\hat T_j^{(2)}$ operators, these operators must form a set of mutually commuting operators or

$$[\hat Q_i^{(1)},\hat T_j^{(2)}] = 0 \quad\text{for all } i,j \tag{6.332}$$

An Example 


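Imagine we are in a fictitious world in which the single-particle Hilbert space is 2-dimensional. Let us denote the corresponding basis vectors by $|+\rangle$ and $|-\rangle$. In addition, let arbitrary operators $\hat\beta_1^{(1)}$ and $\hat\beta_1^{(2)}$ be represented (in their respective spaces) by

$$\begin{aligned}
\hat\beta_1^{(1)} &= \begin{pmatrix} {}_1\langle+|\hat\beta_1^{(1)}|+\rangle_1 & {}_1\langle+|\hat\beta_1^{(1)}|-\rangle_1\\ {}_1\langle-|\hat\beta_1^{(1)}|+\rangle_1 & {}_1\langle-|\hat\beta_1^{(1)}|-\rangle_1 \end{pmatrix} = \begin{pmatrix} a & b\\ c & d \end{pmatrix}\\
\hat\beta_1^{(2)} &= \begin{pmatrix} {}_2\langle+|\hat\beta_1^{(2)}|+\rangle_2 & {}_2\langle+|\hat\beta_1^{(2)}|-\rangle_2\\ {}_2\langle-|\hat\beta_1^{(2)}|+\rangle_2 & {}_2\langle-|\hat\beta_1^{(2)}|-\rangle_2 \end{pmatrix} = \begin{pmatrix} e & f\\ g & h \end{pmatrix}
\end{aligned} \tag{6.333}$$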
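These are operators in space 1 ($V_1$) and space 2 ($V_2$), respectively. The space $V_1\otimes V_2$ is therefore spanned by 4 vectors. They are

$$\begin{aligned}
|+\rangle_1|+\rangle_2 &= |+\rangle_1\otimes|+\rangle_2 = |++\rangle, & |+\rangle_1|-\rangle_2 &= |+\rangle_1\otimes|-\rangle_2 = |+-\rangle\\
|-\rangle_1|+\rangle_2 &= |-\rangle_1\otimes|+\rangle_2 = |-+\rangle, & |-\rangle_1|-\rangle_2 &= |-\rangle_1\otimes|-\rangle_2 = |--\rangle
\end{aligned} \tag{6.334}$$

We define the general operator (acts in both spaces)

$$\begin{aligned}
\hat O_{12} = \hat\beta_1^{(1)}\otimes\hat I^{(2)} &= \begin{pmatrix}
\langle++|\hat O_{12}|++\rangle & \langle++|\hat O_{12}|+-\rangle & \langle++|\hat O_{12}|-+\rangle & \langle++|\hat O_{12}|--\rangle\\
\langle+-|\hat O_{12}|++\rangle & \langle+-|\hat O_{12}|+-\rangle & \langle+-|\hat O_{12}|-+\rangle & \langle+-|\hat O_{12}|--\rangle\\
\langle-+|\hat O_{12}|++\rangle & \langle-+|\hat O_{12}|+-\rangle & \langle-+|\hat O_{12}|-+\rangle & \langle-+|\hat O_{12}|--\rangle\\
\langle--|\hat O_{12}|++\rangle & \langle--|\hat O_{12}|+-\rangle & \langle--|\hat O_{12}|-+\rangle & \langle--|\hat O_{12}|--\rangle
\end{pmatrix}\\
&= \begin{pmatrix} a & 0 & b & 0\\ 0 & a & 0 & b\\ c & 0 & d & 0\\ 0 & c & 0 & d \end{pmatrix}
\end{aligned} \tag{6.335}$$

where we have used the type of calculation below to determine the matrix elements.

$$\begin{aligned}
\langle++|\hat O_{12}|++\rangle &= \langle++|\hat\beta_1^{(1)}\otimes\hat I^{(2)}|++\rangle\\
&= {}_1\langle+|\hat\beta_1^{(1)}|+\rangle_1\;{}_2\langle+|\hat I^{(2)}|+\rangle_2\\
&= {}_1\langle+|\hat\beta_1^{(1)}|+\rangle_1\;{}_2\langle+|+\rangle_2\\
&= {}_1\langle+|\hat\beta_1^{(1)}|+\rangle_1 = a
\end{aligned}$$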

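Similarly, we have

$$\hat I^{(1)}\otimes\hat\beta_1^{(2)} = \begin{pmatrix} e & f & 0 & 0\\ g & h & 0 & 0\\ 0 & 0 & e & f\\ 0 & 0 & g & h \end{pmatrix} \tag{6.336}$$

and

$$\hat\beta_1^{(1)}\otimes\hat\beta_1^{(2)} = \begin{pmatrix} ae & af & be & bf\\ ag & ah & bg & bh\\ ce & cf & de & df\\ cg & ch & dg & dh \end{pmatrix} \tag{6.337}$$

Therefore, using the Pauli spin operators defined (in each space) by

$$\hat\sigma_1|\pm\rangle = |\mp\rangle,\quad \hat\sigma_2|\pm\rangle = \pm i|\mp\rangle,\quad \hat\sigma_3|\pm\rangle = \pm|\pm\rangle \tag{6.338}$$

so that these operators have the matrix representations (in each space)

$$\hat\sigma_1 = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix},\quad \hat\sigma_2 = \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix},\quad \hat\sigma_3 = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \tag{6.339}$$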
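We then have

$$\hat\sigma_1^{(1)}\otimes\hat\sigma_1^{(2)} = \begin{pmatrix} 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{6.340}$$

$$\hat\sigma_2^{(1)}\otimes\hat\sigma_2^{(2)} = \begin{pmatrix} 0 & 0 & 0 & -1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{6.341}$$

$$\hat\sigma_3^{(1)}\otimes\hat\sigma_3^{(2)} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{6.342}$$

Thus,

$$\hat{\boldsymbol\sigma}^{(1)}\cdot\hat{\boldsymbol\sigma}^{(2)} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 2 & 0\\ 0 & 2 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{6.343}$$

and

$$\hat f = a\hat I + b\,\hat{\boldsymbol\sigma}^{(1)}\cdot\hat{\boldsymbol\sigma}^{(2)} = \begin{pmatrix} a+b & 0 & 0 & 0\\ 0 & a-b & 2b & 0\\ 0 & 2b & a-b & 0\\ 0 & 0 & 0 & a+b \end{pmatrix} \tag{6.344}$$

The states

$$\begin{aligned}
|+\rangle_1|+\rangle_2 &= |++\rangle, & |+\rangle_1|-\rangle_2 &= |+-\rangle\\
|-\rangle_1|+\rangle_2 &= |-+\rangle, & |-\rangle_1|-\rangle_2 &= |--\rangle
\end{aligned} \tag{6.345}$$

are given by

$$|++\rangle = \begin{pmatrix} 1\\ 0\\ 0\\ 0 \end{pmatrix},\quad |+-\rangle = \begin{pmatrix} 0\\ 1\\ 0\\ 0 \end{pmatrix},\quad |-+\rangle = \begin{pmatrix} 0\\ 0\\ 1\\ 0 \end{pmatrix},\quad |--\rangle = \begin{pmatrix} 0\\ 0\\ 0\\ 1 \end{pmatrix} \tag{6.346}$$

and we have

$$\hat\sigma_1^{(1)}|++\rangle = \hat\sigma_1^{(1)}\otimes\hat I^{(2)}|++\rangle = \begin{pmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} 1\\ 0\\ 0\\ 0 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 1\\ 0 \end{pmatrix} = |-+\rangle \tag{6.347}$$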

as expected. 
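The matrices in (6.340) through (6.347) can be checked mechanically. The following is a minimal sketch in plain Python, assuming the basis ordering $|++\rangle, |+-\rangle, |-+\rangle, |--\rangle$ used above; the helper names `kron`, `add` and `matvec` are illustrative choices and do not come from the text.

```python
# Sketch: direct (Kronecker) products of 2x2 matrices in plain Python.
# Helper names (kron, add, matvec) are illustrative, not from the text.

def kron(A, B):
    """Return A (x) B with basis ordering |++>, |+->, |-+>, |-->."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def add(*matrices):
    """Entrywise sum of equally sized matrices."""
    return [[sum(entries) for entries in zip(*rows)]
            for rows in zip(*matrices)]

def matvec(M, v):
    """Apply the matrix M to the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

I2 = [[1, 0], [0, 1]]
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]

# sigma^(1) . sigma^(2) = sum over k of sigma_k (x) sigma_k, cf. (6.343)
s_dot_s = add(kron(s1, s1), kron(s2, s2), kron(s3, s3))

# sigma_1 acting on particle 1 only, applied to |++> = (1, 0, 0, 0), cf. (6.347)
plus_plus = [1, 0, 0, 0]
flipped = matvec(kron(s1, I2), plus_plus)
```

Here `flipped` is the column vector of $|-+\rangle$, and `s_dot_s` reproduces the matrix in (6.343).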



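An alternative way to think about the direct product is shown below:

$$\hat{\sigma}_1^{(1)} \otimes \hat{\sigma}_1^{(2)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \hat{\sigma}_1^{(2)} = \begin{pmatrix} 0 & \hat{\sigma}_1^{(2)} \\ \hat{\sigma}_1^{(2)} & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{6.348}$$


Extending This Idea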


This procedure for constructing composite state vectors using the direct product 
also works when dealing with certain dynamical properties of a one-particle 
state. 


Any one-particle state must represent the various degrees of freedom for the 
one-particle system. These degrees of freedom, as we have seen, are related to 
observables and their associated operators and eigenvalues. Thus, quantities 
like $Q_\alpha$, $P_\beta$, and $S_\gamma$ all represent different degrees of freedom of the physical
system or state vector.

In the case of certain one-particle system degrees of freedom, which are said 
to be independent, we can write one-particle state vectors and operators using 
direct products. For instance, as we saw in an earlier example, both $Q_\alpha$ and
$P_\beta$ are independent of an internal degree of freedom that we called spin $S_\gamma$. In
fact, we defined an internal degree of freedom as one which was independent
of the center of mass degrees of freedom. We defined this independence via
commutators by $[Q_\alpha, S_\gamma] = 0 = [P_\beta, S_\gamma]$.

Another example of independence of degrees of freedom was our assumption that
$[Q_\alpha, Q_\beta] = 0$, which says that the three components of the position operator are
independent degrees of freedom. This should not surprise us since it just reflects
the physical assumption that it is possible to prepare a state where a particle is
localized arbitrarily close to a single point in 3-dimensional position space, i.e.,
the particle can have a “position”.

Similarly, our assumption that $[P_\alpha, P_\beta] = 0$ means it is possible to prepare a state
where a particle is localized arbitrarily close to a single point in 3-dimensional
momentum space, i.e., the particle can have a “momentum”.

It is also clear, since $[Q_\alpha, P_\beta] = i\hbar\,\delta_{\alpha\beta} \neq 0$ for $\alpha = \beta$, that we will not be able to prepare a state in
which both $Q_\alpha$ and $P_\alpha$ (components along the same axis) are simultaneously
localized arbitrarily close to a single point in phase space, and since $[J_\beta, J_\gamma] \neq 0$ for $\beta \neq \gamma$
we will not be able to prepare a state in which two different components of the
angular momentum have definite values.

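This means that we can write single particle states as direct products. For
example

$$|\vec{x}\rangle = |x_1\rangle \otimes |x_2\rangle \otimes |x_3\rangle \tag{6.349}$$

$$\hat{X}_1\hat{X}_2\hat{X}_3 = \hat{X}_1 \otimes \hat{X}_2 \otimes \hat{X}_3 \tag{6.350}$$

$$\hat{X}_1\hat{X}_2\hat{X}_3|\vec{x}\rangle = (\hat{X}_1|x_1\rangle) \otimes (\hat{X}_2|x_2\rangle) \otimes (\hat{X}_3|x_3\rangle) \tag{6.351}$$

or

$$|\vec{x}, s_z\rangle = |x_1\rangle \otimes |x_2\rangle \otimes |x_3\rangle \otimes |s_z\rangle \tag{6.352}$$

$$\hat{X}_1\hat{X}_2\hat{X}_3\hat{S}_z = \hat{X}_1 \otimes \hat{X}_2 \otimes \hat{X}_3 \otimes \hat{S}_z \tag{6.353}$$

$$\hat{X}_1\hat{X}_2\hat{X}_3\hat{S}_z|\vec{x}, s_z\rangle = (\hat{X}_1|x_1\rangle) \otimes (\hat{X}_2|x_2\rangle) \otimes (\hat{X}_3|x_3\rangle) \otimes (\hat{S}_z|s_z\rangle) \tag{6.354}$$

We can use words as follows:

$$|\vec{x}\rangle = |x_1\rangle \otimes |x_2\rangle \otimes |x_3\rangle \;\to\; \text{particle at } (x_1, x_2, x_3) \tag{6.355}$$

or

$$|\vec{x}, s_z\rangle = |x_1\rangle \otimes |x_2\rangle \otimes |x_3\rangle \otimes |s_z\rangle \;\to\; \text{particle at } (x_1, x_2, x_3) \text{ with spin } s_z \tag{6.356}$$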

If we can construct state vectors for two-particle systems, then it must be possible
to define appropriate probability distributions. These are called joint probability
distributions.

If we prepare both particles into a direct product state such that their individual
preparations are independent and they do not interact with each other, then
their joint probability distribution for the observables $Q$ and $T$ corresponding to
the operators $\hat{Q}^{(1)}$ and $\hat{T}^{(2)}$ should obey the statistical independence condition
we defined earlier for events A, B, and C, namely,

$$\text{Prob}(A \cap B|C) = \text{Prob}(A|C)\,\text{Prob}(B|C) \tag{6.357}$$

For the direct product state

$$|\psi\rangle = |\alpha\rangle_1 \otimes |\beta\rangle_2 \tag{6.358}$$

this says that the joint probability distribution of $Q$ and $T$ is

$$\begin{aligned} \text{Prob}((Q = q_m) \cap (T = t_n)|\psi) &= |\langle q_m, t_n|\psi\rangle|^2 \\ &= \text{Prob}(Q = q_m|\alpha)\,\text{Prob}(T = t_n|\beta) \\ &= |{}_1\langle q_m|\alpha\rangle_1|^2\,|{}_2\langle t_n|\beta\rangle_2|^2 \end{aligned} \tag{6.359}$$

The state does not have to be a pure state for this factorization to occur. It
only needs to be represented by a density operator of the form

$$\hat{W} = \hat{W}^{(1)} \otimes \hat{W}^{(2)} \tag{6.360}$$

Further discussion of this topic follows in Section 6.17.
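As a numerical illustration of the factorization (6.359), here is a small sketch in plain Python. It assumes two-dimensional spaces for each particle and arbitrary example amplitudes; the names `kron_vec`, `joint`, `marginal_1`, `marginal_2` and `factorizes` are hypothetical helpers, and the correlated state is included only as a contrast case that is not a direct product.

```python
import math

def kron_vec(u, v):
    """Direct product of two state vectors, ordering (m, n) -> index 2*m + n."""
    return [a * b for a in u for b in v]

def joint(psi, m, n):
    """Joint probability |<q_m, t_n|psi>|^2 for a two-particle state."""
    return abs(psi[2 * m + n]) ** 2

def marginal_1(psi, m):
    """Probability of q_m for particle 1, summed over particle 2 outcomes."""
    return sum(joint(psi, m, n) for n in range(2))

def marginal_2(psi, n):
    """Probability of t_n for particle 2, summed over particle 1 outcomes."""
    return sum(joint(psi, m, n) for m in range(2))

def factorizes(psi, tol=1e-12):
    """True if the joint distribution equals the product of its marginals."""
    return all(abs(joint(psi, m, n) - marginal_1(psi, m) * marginal_2(psi, n)) < tol
               for m in range(2) for n in range(2))

# Example amplitudes (assumed for illustration only)
alpha = [math.cos(0.3), math.sin(0.3)]
beta = [1 / math.sqrt(2), 1j / math.sqrt(2)]

product_state = kron_vec(alpha, beta)
correlated_state = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]

product_ok = factorizes(product_state)        # statistical independence holds
correlated_ok = factorizes(correlated_state)  # fails for a non-product state
```

For the product state each joint probability equals $|\alpha_m|^2|\beta_n|^2$; for the correlated state the joint probability of the first outcome pair is 1/2 while the product of marginals is 1/4.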

Later, we will use these direct product procedures to construct operators and
basis states for multiparticle systems like atoms. Just how we will include
interactions is not clear yet.

6.14. Equations of Motion Revisited and Finished

We discussed time dependence and the time evolution operator earlier. This is 
the most important topic in quantum mechanics and will eventually enable us 
to make predictions about the behavior of real physical systems. Let us now 
review our earlier discussion from a more general point of view in light of some 
of the new ideas we have introduced. 

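We derived a differential equation of motion for the state vector of the form

$$\frac{d}{dt}|\psi(t)\rangle = -\frac{i}{\hbar}\hat{H}(t)|\psi(t)\rangle \tag{6.361}$$

If, initially (at $t = t_0$), we are in the state represented by the ket vector $|\psi(t_0)\rangle = |\psi_0\rangle$,
then we wrote the formal solution of the differential equation in terms of
the time development operator $\hat{U}(t,t_0)$ as

$$|\psi(t)\rangle = \hat{U}(t,t_0)|\psi(t_0)\rangle \tag{6.362}$$

We also showed that $\hat{U}(t,t_0)$ satisfies the same differential equation as $|\psi(t)\rangle$,

$$\frac{\partial}{\partial t}\hat{U}(t,t_0) = -\frac{i}{\hbar}\hat{H}(t)\hat{U}(t,t_0) \tag{6.363}$$

with the boundary condition $\hat{U}(t_0,t_0) = \hat{I}$. This implies, using the relation
$(\hat{A}\hat{B})^\dagger = \hat{B}^\dagger\hat{A}^\dagger$, that

$$\frac{\partial}{\partial t}\hat{U}^\dagger(t,t_0) = \frac{i}{\hbar}\hat{U}^\dagger(t,t_0)\hat{H}^\dagger(t) \tag{6.364}$$

We then have

$$\begin{aligned} \frac{\partial}{\partial t}\left(\hat{U}^\dagger\hat{U}\right) &= \frac{\partial \hat{U}^\dagger}{\partial t}\hat{U} + \hat{U}^\dagger\frac{\partial \hat{U}}{\partial t} \\ &= -\frac{i}{\hbar}\hat{U}^\dagger\hat{H}\hat{U} + \frac{i}{\hbar}\hat{U}^\dagger\hat{H}^\dagger\hat{U} \\ &= -\frac{i}{\hbar}\hat{U}^\dagger(\hat{H} - \hat{H}^\dagger)\hat{U} \end{aligned} \tag{6.365}$$

If $\hat{H}$ is Hermitian, then $\hat{H} = \hat{H}^\dagger$ and

$$\frac{\partial}{\partial t}\left(\hat{U}^\dagger\hat{U}\right) = 0 \tag{6.366}$$

which implies that

$$\hat{U}^\dagger(t,t_0)\hat{U}(t,t_0) = c\hat{I} \quad \text{where } c = \text{constant} \tag{6.367}$$

But the boundary condition implies that $\hat{U}^\dagger(t_0,t_0)\hat{U}(t_0,t_0) = \hat{I}$, which implies
that $c = 1$. Therefore, in general we have $\hat{U}^\dagger(t,t_0)\hat{U}(t,t_0) = \hat{I}$, which is the unitarity
condition and implies $\hat{U}^\dagger = \hat{U}^{-1}$, or that $\hat{U}$ is unitary when $\hat{H}$ is Hermitian.

This agrees with our earlier results.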

Now, if $\hat{H}(t)$ is independent of $t$, then the differential equation has a simple
solution

$$\hat{U}(t,t_0) = e^{-i(t - t_0)\hat{H}/\hbar} \tag{6.368}$$

which is the form we assumed for the time evolution operator during the discussion
of symmetry transformations.
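The link between a Hermitian $\hat{H}$ and a unitary $\hat{U}$ can be checked numerically for a small matrix. The sketch below is only an illustration under stated assumptions: it uses units with $\hbar = 1$, an arbitrary 2x2 example Hamiltonian, and a truncated power series for the exponential (adequate for small matrices, not a general-purpose method). All function names are hypothetical.

```python
def matmul(A, B):
    """Matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Hermitian conjugate (conjugate transpose)."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm_series(M, terms=40):
    """exp(M) from its power series; fine for small matrices of modest norm."""
    n = len(M)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def evolution_operator(H, t, t0=0.0, hbar=1.0):
    """U(t, t0) = exp(-i (t - t0) H / hbar) for a time-independent H, cf. (6.368)."""
    return expm_series([[-1j * (t - t0) * h / hbar for h in row] for row in H])

def is_identity(M, tol=1e-12):
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(M)) for j in range(len(M)))

H = [[1.0, 0.5], [0.5, -1.0]]            # Hermitian example (assumed)
U = evolution_operator(H, t=0.7)
unitary = is_identity(matmul(dagger(U), U))

H_bad = [[1.0, 0.0], [0.0, -1j]]         # not Hermitian
U_bad = evolution_operator(H_bad, t=0.7)
unitary_bad = is_identity(matmul(dagger(U_bad), U_bad))
```

With the Hermitian example `unitary` is true, while the non-Hermitian `H_bad` gives $\hat{U}^\dagger\hat{U} \neq \hat{I}$, in line with (6.365).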

If $\hat{H}(t)$ is not independent of $t$, then no simple closed form solution can be
given for $\hat{U}(t,t_0)$. We must use perturbation theory, as we shall see later when
we develop time-dependent perturbation theory.

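We can now derive the equation of motion for the density operator. We use a
pure state for simplicity. We have

$$\begin{aligned} \hat{W}(t) = |\psi(t)\rangle\langle\psi(t)| &= \hat{U}(t,t_0)|\psi(t_0)\rangle\langle\psi(t_0)|\hat{U}^\dagger(t,t_0) \\ &= \hat{U}(t,t_0)\hat{W}(t_0)\hat{U}^\dagger(t,t_0) \end{aligned} \tag{6.369}$$

Now, using $\hat{W}(t_0) = \hat{W}_0$, differentiating with respect to $t$ gives

$$\begin{aligned} \frac{d\hat{W}(t)}{dt} &= \frac{d\hat{U}(t,t_0)}{dt}\hat{W}_0\hat{U}^\dagger(t,t_0) + \hat{U}(t,t_0)\hat{W}_0\frac{d\hat{U}^\dagger(t,t_0)}{dt} \\ &= -\frac{i}{\hbar}\hat{H}\hat{U}(t,t_0)\hat{W}_0\hat{U}^\dagger(t,t_0) + \frac{i}{\hbar}\hat{U}(t,t_0)\hat{W}_0\hat{U}^\dagger(t,t_0)\hat{H} \\ &= -\frac{i}{\hbar}[\hat{H}(t), \hat{W}(t)] \end{aligned} \tag{6.370}$$

We will assume that this equation is also true for general states.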

Now, we have stated earlier that no physical significance can be attached in
quantum mechanics to operators and vectors. The only physically significant
objects are the probability distributions of observables or their expectation values.
Earlier we derived the result

$$\langle\hat{Q}\rangle = Tr(\hat{W}\hat{Q}) \tag{6.371}$$

We now assume that this result carries over to the time-dependent case and we
have

$$\langle\hat{Q}\rangle_t = Tr(\hat{W}(t)\hat{Q}) \tag{6.372}$$

Using the expression (6.369) for $\hat{W}(t)$ and the fact that the trace is invariant
under cyclic permutation, i.e.,

$$Tr(\hat{A}\hat{B}\hat{C}) = Tr(\hat{C}\hat{A}\hat{B}) = Tr(\hat{B}\hat{C}\hat{A}) \tag{6.373}$$

we get

$$\begin{aligned} \langle\hat{Q}\rangle_t = Tr(\hat{W}(t)\hat{Q}) &= Tr(\hat{U}(t,t_0)\hat{W}_0\hat{U}^\dagger(t,t_0)\hat{Q}) \\ &= Tr(\hat{W}_0\hat{U}^\dagger(t,t_0)\hat{Q}\hat{U}(t,t_0)) \end{aligned} \tag{6.374}$$

This result is the formal basis of the Schrödinger and Heisenberg pictures we
discussed earlier.

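If we leave the time dependence in the density operator, then

$$\langle\hat{Q}\rangle_t = Tr(\hat{W}(t)\hat{Q}) \tag{6.375}$$

where

$$\frac{d\hat{W}(t)}{dt} = -\frac{i}{\hbar}[\hat{H}(t), \hat{W}(t)] \quad \text{and} \quad \frac{d}{dt}|\psi(t)\rangle = -\frac{i}{\hbar}\hat{H}(t)|\psi(t)\rangle \tag{6.376}$$

$\hat{Q}$ is independent of time in these equations. This is the Schrödinger picture.
On the other hand, if we write

$$\langle\hat{Q}\rangle_t = Tr(\hat{W}_0\hat{U}^\dagger(t,t_0)\hat{Q}\hat{U}(t,t_0)) = Tr(\hat{W}_0\hat{Q}_H(t)) \tag{6.377}$$

where $\hat{Q}_H(t) = \hat{U}^\dagger(t,t_0)\hat{Q}\hat{U}(t,t_0)$, then the operator is time dependent and the
density operator (and hence the state vectors) are independent of time. This is
the Heisenberg picture.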

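We can derive the equation of motion of the time-dependent operators in the
Heisenberg picture as follows:

$$\begin{aligned} \frac{d}{dt}\hat{Q}_H(t) &= \frac{\partial\hat{U}^\dagger}{\partial t}\hat{Q}\hat{U} + \hat{U}^\dagger\frac{\partial\hat{Q}}{\partial t}\hat{U} + \hat{U}^\dagger\hat{Q}\frac{\partial\hat{U}}{\partial t} \\ &= \frac{i}{\hbar}\left(\hat{U}^\dagger\hat{H}\hat{Q}\hat{U} - \hat{U}^\dagger\hat{Q}\hat{H}\hat{U}\right) + \hat{U}^\dagger\frac{\partial\hat{Q}}{\partial t}\hat{U} \\ &= \frac{i}{\hbar}\left(\hat{U}^\dagger\hat{H}\hat{U}\hat{U}^\dagger\hat{Q}\hat{U} - \hat{U}^\dagger\hat{Q}\hat{U}\hat{U}^\dagger\hat{H}\hat{U}\right) + \hat{U}^\dagger\frac{\partial\hat{Q}}{\partial t}\hat{U} \\ &= \frac{i}{\hbar}[\hat{H}_H(t), \hat{Q}_H(t)] + \left(\frac{\partial\hat{Q}}{\partial t}\right)_H \end{aligned} \tag{6.378}$$

where we have included the possibility that $\hat{Q}$ has some explicit time dependence
and we have used the definition

$$\hat{A}_H(t) = \hat{U}^\dagger(t,t_0)\hat{A}\hat{U}(t,t_0) \tag{6.379}$$

for any operator. Note the change in sign in front of the commutator in this
equation for the operator from that of the density operator in the Schrödinger
picture.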

The two pictures are clearly equivalent as mathematical formalisms, since they
are derived from the same expectation value formula by an internal rearrangement
of terms. Another way to say this is that the two pictures are equivalent
because the only physically significant quantity $\langle\hat{Q}\rangle_t$ depends only on the relative
motion in time of $\hat{W}$ and $\hat{Q}$, which is the same in both cases.

In the Schrödinger picture $\hat{W}(t)$ moves forward in time (term of the form
$\hat{U}\hat{W}_0\hat{U}^\dagger = \hat{U}\hat{W}_0\hat{U}^{-1}$) and in the Heisenberg picture $\hat{Q}_H(t)$ moves backward in
time (term of the form $\hat{U}^\dagger\hat{Q}\hat{U} = \hat{U}^{-1}\hat{Q}\hat{U}$). The opposite senses of motion in
time produce the sign difference of the commutator terms in the equations of
motion. These two pictures are mutually exclusive and cannot be used together.

Finally, we determine expressions for $d\langle\hat{Q}\rangle_t/dt$ in each picture.

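In the Schrödinger picture

$$\begin{aligned} \frac{d\langle\hat{Q}\rangle_t}{dt} = \frac{d}{dt}Tr(\hat{W}(t)\hat{Q}) &= Tr\left[\frac{d\hat{W}}{dt}\hat{Q} + \hat{W}\frac{\partial\hat{Q}}{\partial t}\right] \\ &= Tr\left[-\frac{i}{\hbar}\left(\hat{H}\hat{W}\hat{Q} - \hat{W}\hat{H}\hat{Q}\right) + \hat{W}\frac{\partial\hat{Q}}{\partial t}\right] \\ &= Tr\left[-\frac{i}{\hbar}\left(\hat{W}\hat{Q}\hat{H} - \hat{W}\hat{H}\hat{Q}\right) + \hat{W}\frac{\partial\hat{Q}}{\partial t}\right] \\ &= Tr\left[\frac{i}{\hbar}\hat{W}(t)[\hat{H}, \hat{Q}] + \hat{W}\frac{\partial\hat{Q}}{\partial t}\right] \end{aligned} \tag{6.380}$$

In the Heisenberg picture

$$\frac{d\langle\hat{Q}\rangle_t}{dt} = Tr\left(\hat{W}_0\frac{d\hat{Q}_H(t)}{dt}\right) = Tr\left[\frac{i}{\hbar}\hat{W}_0[\hat{H}_H(t), \hat{Q}_H(t)] + \hat{W}_0\left(\frac{\partial\hat{Q}}{\partial t}\right)_H\right] \tag{6.381}$$

For a pure state, we can rewrite these results in terms of the state vectors instead
of the density operator. We have in the Schrödinger picture

$$\langle\hat{Q}\rangle_t = \langle\psi(t)|\hat{Q}|\psi(t)\rangle \quad \text{where } |\psi(t)\rangle = \hat{U}(t,t_0)|\psi_0\rangle \tag{6.382}$$

and in the Heisenberg picture

$$\langle\hat{Q}\rangle_t = \langle\psi_0|\hat{Q}_H(t)|\psi_0\rangle \quad \text{where } \hat{Q}_H(t) = \hat{U}^\dagger(t,t_0)\hat{Q}\hat{U}(t,t_0) \tag{6.383}$$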

as we saw in our earlier discussions. 
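The equivalence of (6.382) and (6.383) can be seen in a two-level example. This sketch assumes $\hbar = 1$, a Hamiltonian $\hat{H} = (\omega/2)\hat{\sigma}_3$ (so that $\hat{U}$ is diagonal and can be written in closed form), the observable $\hat{Q} = \hat{\sigma}_1$, and an equal-weight initial state; all of these choices, and the helper names, are illustrative.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """<u|v> with the conjugate taken on the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

omega, t = 2.0, 0.4   # assumed example values, hbar = 1

# U(t, 0) = exp(-i t H) for H = (omega/2) sigma_3 is diagonal
U = [[cmath.exp(-1j * omega * t / 2), 0],
     [0, cmath.exp(1j * omega * t / 2)]]
Q = [[0, 1], [1, 0]]                       # sigma_1
psi0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# Schrodinger picture: the state moves, the operator is fixed
psi_t = matvec(U, psi0)
expect_schrodinger = inner(psi_t, matvec(Q, psi_t))

# Heisenberg picture: the operator moves, the state is fixed
Q_H = matmul(dagger(U), matmul(Q, U))
expect_heisenberg = inner(psi0, matvec(Q_H, psi0))
```

Both numbers agree, and for this particular choice they equal $\cos(\omega t)$.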

6.15. Symmetries, Conservation Laws and Stationary States

Let $\hat{T}(s) = e^{is\hat{K}}$ represent a continuous unitary transformation with a Hermitian
generator $\hat{K} = \hat{K}^\dagger$. Another operator $\hat{A}$ representing some observable is
invariant under this transformation if

$$\hat{T}(s)\hat{A}\hat{T}^{-1}(s) = \hat{A} \tag{6.384}$$

or

$$\hat{A}\hat{T}(s) - \hat{T}(s)\hat{A} = [\hat{A}, \hat{T}(s)] = 0 \tag{6.385}$$

If $s$ is an infinitesimal, then we can write

$$(\hat{I} + is\hat{K})\hat{A}(\hat{I} - is\hat{K}) = \hat{A} \;\to\; \hat{A} + is[\hat{K}, \hat{A}] = \hat{A} \tag{6.386}$$

or

$$[\hat{K}, \hat{A}] = 0 \tag{6.387}$$

In words, the invariance of $\hat{A}$ under the continuous transformation $\hat{T}(s) = e^{is\hat{K}}$
for all $s$ implies it is true for infinitesimal $s$ and thus leads to the commutator
condition for invariance, which says that the operator commutes with the Hermitian
generator of the transformation.

It works both ways:

$[\hat{T}, \hat{A}] = 0$ → invariance under finite transformation
→ invariance under infinitesimal transformation
→ $[\hat{K}, \hat{A}] = 0$

or

$[\hat{K}, \hat{A}] = 0$ → invariance under infinitesimal transformation
→ invariance under finite transformation
→ $[\hat{T}, \hat{A}] = 0$

If $\hat{K}$ depends on $t$, then the commutators $[\hat{K}(t), \hat{A}] = 0$ and $[\hat{T}(t), \hat{A}] = 0$ must
hold for all $t$.

Now, the Hermitian generators of the symmetry transformation, as we have
seen, correspond to dynamical variables of a physical system.

space displacements ⇔ $\hat{P}$

rotations ⇔ $\hat{J}$

time displacements ⇔ $\hat{H}$

These symmetry generators have no explicit time dependence, i.e., $\partial\hat{K}/\partial t = 0$.
Therefore,

$$\frac{d\langle\hat{K}\rangle_t}{dt} = Tr\left[\frac{i}{\hbar}\hat{W}(t)[\hat{H}, \hat{K}]\right] = 0 \tag{6.388}$$

if $\hat{H}$ is invariant under the corresponding symmetry transformation, i.e., $[\hat{H}, \hat{K}] = 0$.
Now $[\hat{H}, \hat{K}] = 0$ → $[\hat{H}, f(\hat{K})] = 0$. This says that we must have

$$[\hat{H}, \theta(x - \hat{K})] = 0 \tag{6.389}$$

But in our probability discussions, we showed that

$\langle\theta(x - \hat{K})\rangle = \text{Prob}(K < x|W)$

= probability that observable $K$ has a value $< x$ given $W$

This then says that $\text{Prob}(K < x|W)$ is independent of $t$ no matter what initial
state we start with. In this case, the observable $K$ is a constant of the motion.

Examples: 


$[\hat{H}, \hat{P}_\alpha] = 0$ implies invariance under a space displacement along the $\alpha$-axis. This
implies that $P_\alpha$ = constant of the motion (is conserved) = linear momentum
along the $\alpha$-axis.

$[\hat{H}, \hat{J}_\alpha] = 0$ implies invariance under a rotation about the $\alpha$-axis. This implies
that $J_\alpha$ = constant of the motion (is conserved) = angular momentum about
the $\alpha$-axis.

If $\hat{H}$ is not an explicit function of $t$, then since $[\hat{H}, \hat{H}] = 0$, $\hat{H}$ is invariant under
time translations = constant of the motion = energy of the system.


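Now suppose that $\hat{H}$ is independent of $t$ and that

$$|\psi(0)\rangle = [\text{eigenvector of } \hat{H}] = |E_n\rangle \quad \text{such that } \hat{H}|E_n\rangle = E_n|E_n\rangle \tag{6.390}$$

Then we have, using $[\hat{H}, \hat{U}] = [\hat{H}, f(\hat{H})] = 0$,

$$\begin{aligned} \frac{d}{dt}|\psi(t)\rangle &= -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle = -\frac{i}{\hbar}\hat{H}\hat{U}(t,0)|\psi(0)\rangle \\ &= -\frac{i}{\hbar}\hat{H}\hat{U}(t,0)|E_n\rangle = -\frac{i}{\hbar}\hat{U}(t,0)\hat{H}|E_n\rangle \\ &= -\frac{i}{\hbar}\hat{U}(t,0)E_n|E_n\rangle = -\frac{i}{\hbar}E_n\hat{U}(t,0)|E_n\rangle \end{aligned} \tag{6.391}$$

$$= -\frac{i}{\hbar}E_n|\psi(t)\rangle \tag{6.392}$$

which has the solution

$$|\psi(t)\rangle = e^{-iE_nt/\hbar}|E_n\rangle \tag{6.393}$$

In this case, we then have for the expectation value of any observable represented
by the operator $\hat{R}$

$$\langle\hat{R}\rangle = \langle\psi(t)|\hat{R}|\psi(t)\rangle = \langle E_n|e^{iE_nt/\hbar}\hat{R}e^{-iE_nt/\hbar}|E_n\rangle = \langle E_n|\hat{R}|E_n\rangle \tag{6.394}$$

or $\langle\hat{R}\rangle$ is independent of $t$ (for this state). This implies that $\langle f(\hat{R})\rangle$ is also
independent of $t$, which, finally, implies that $\langle\theta(x - \hat{R})\rangle$ is independent of $t$. This
means that, in this state,

$$\text{Prob}(R < x|\psi) \text{ is independent of } t \tag{6.395}$$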

This kind of state is called a stationary state. In a stationary state the expectation
values and probabilities of all observables are independent of time. If an
observable is a constant of the motion, however, then this would be true for
any state and not just a stationary state. So these are very different physical
concepts.
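The difference between a stationary state and a constant of the motion can be made concrete with a two-level sketch. The assumptions here are illustrative only: $\hbar = 1$, $\hat{H}$ diagonal with eigenvalues 1 and 3, an observable $\hat{R} = \hat{\sigma}_1$ that does not commute with $\hat{H}$, and an observable $\hat{K} = \hat{\sigma}_3$ that does.

```python
import cmath
import math

energies = [1.0, 3.0]          # eigenvalues of a diagonal H (assumed, hbar = 1)
R = [[0, 1], [1, 0]]           # does not commute with H
K = [[1, 0], [0, -1]]          # commutes with H: a constant of the motion

def evolve(psi0, t):
    """Each energy component picks up the phase exp(-i E t), cf. (6.393)."""
    return [cmath.exp(-1j * E * t) * c for E, c in zip(energies, psi0)]

def expect(op, psi):
    """Expectation value <psi|op|psi> (real for a Hermitian op)."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2)).real

stationary = [1.0, 0.0]                                  # eigenvector of H
superposition = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # not an eigenvector

times = [0.0, 0.5, 1.3]
R_stationary = [expect(R, evolve(stationary, t)) for t in times]
R_superposition = [expect(R, evolve(superposition, t)) for t in times]
K_superposition = [expect(K, evolve(superposition, t)) for t in times]
```

In the stationary state `R_stationary` does not change with time even though $\hat{R}$ is not conserved; in the superposition `R_superposition` oscillates as $\cos(2t)$ for these energies, while `K_superposition` stays fixed because $\hat{K}$ commutes with $\hat{H}$.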

Now, if [K,H] = 0, then K and H have a common set of eigenvectors (it is 
a complete set). But the eigenvectors of H are stationary states. This means 
that we can prepare systems in stationary states where both the energy and the 
observable represented by K have definite values (no dispersion). 

Suppose we have a set of mutually commuting observables and that they all also 
commute with H. Then they have a common set of eigenvectors. 

We will use the eigenvalues of $\hat{H}$ and all the eigenvalues of this mutually commuting
set of observables to label state vectors. We will call the labels quantum
numbers. They will designate all we know about a state vector.


6.16. The Collapse or Reduction Postulate 

Based on the unitary time evolution postulate, a system consisting of a quantum
system (Q-system) and a measurement system (M-system) would necessarily
evolve in this way:

$$|\text{initial}\rangle = (a|+\rangle_Q + b|-\rangle_Q)|0\rangle_M \;\to\; |\text{final}\rangle = a|+\rangle_Q|+1\rangle_M + b|-\rangle_Q|-1\rangle_M \tag{6.396}$$

which is a superposition of Q-states and M-states. We assume that the M-states
represent macroscopic pointer locations on some meter.

This says that time evolution, within the framework of the standard postulates,
CORRELATES or ENTANGLES the dynamical variable (Q-system) to
be measured and the macroscopic (M-system) indicator that can be directly
(macroscopically) observed.

Derivation: Suppose that the meter has eigenvectors (labeled by eigenvalues)

$|+1\rangle_M$ ⇒ meter on: reading = +1
$|-1\rangle_M$ ⇒ meter on: reading = -1
$|0\rangle_M$ ⇒ meter off: reading = 0

and the system has eigenvectors (labeled by eigenvalues)

$|+\rangle_Q$ ⇒ value = +1
$|-\rangle_Q$ ⇒ value = -1

The initial state is

$$|\text{initial}\rangle = (a|+\rangle_Q + b|-\rangle_Q)|0\rangle_M \tag{6.397}$$

which represents the quantum system in a superposition and the meter off.


We are interested in the evolution of this state according to quantum mechanics.
If, instead of the above initial state, we started with the initial state

$$|A\rangle = |+\rangle_Q|0\rangle_M \tag{6.398}$$

and then turn on the meter, this state must evolve into

$$|A'\rangle = |+\rangle_Q|+1\rangle_M \tag{6.399}$$

indicating that the meter has measured the appropriate value (that is the definition
of a "good" meter).

Similarly, if, instead of the above initial state, we started with the initial state

$$|B\rangle = |-\rangle_Q|0\rangle_M \tag{6.400}$$

and then turn on the meter, this state must evolve into

$$|B'\rangle = |-\rangle_Q|-1\rangle_M \tag{6.401}$$

indicating that the meter has measured the appropriate value (again, that is
the definition of a "good" meter).

If the system is in the initial state corresponding to a superposition of these
two special states, however, then the linearity of quantum mechanics says that
it must evolve into

$$|\text{final}\rangle = a|+\rangle_Q|+1\rangle_M + b|-\rangle_Q|-1\rangle_M \tag{6.402}$$

as we assumed above in (6.396).
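The linearity argument can be mimicked with a few lines of Python. This is only a bookkeeping sketch, not a model of any real apparatus: states are dictionaries mapping the label pair (Q value, meter reading) to an amplitude, the "good meter" rule is applied to each basis state, and the result is extended linearly. The function names and the example amplitudes are assumptions made for the illustration.

```python
def evolve_basis(label):
    """Good-meter rule on a basis state: |q>_Q |0>_M -> |q>_Q |q>_M."""
    q, m = label
    if m != 0:
        raise ValueError("this sketch only defines the rule for a meter that starts off")
    return (q, q)

def evolve(state):
    """Extend the basis rule linearly to a superposition {label: amplitude}."""
    out = {}
    for label, amp in state.items():
        new_label = evolve_basis(label)
        out[new_label] = out.get(new_label, 0) + amp
    return out

def correlation_det(state):
    """For amplitudes c(q, m) with q, m in {+1, -1}: a product state has
    c(+,+)c(-,-) - c(+,-)c(-,+) = 0, so a nonzero value signals entanglement."""
    c = lambda q, m: state.get((q, m), 0)
    return c(+1, +1) * c(-1, -1) - c(+1, -1) * c(-1, +1)

a, b = 0.6, 0.8   # example amplitudes with |a|^2 + |b|^2 = 1
initial = {(+1, 0): a, (-1, 0): b}
final = evolve(initial)
```

The result `final` contains only the correlated terms $(+, +1)$ and $(-, -1)$, as in (6.402), and `correlation_det(final)` is nonzero whenever both $a$ and $b$ are nonzero.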

Interpreting the state vector: Two models....

1. Pure state $|\psi\rangle$ implies a complete description of an individual Q-system.
This corresponds to the statement that a dynamical variable $P$ has the
value $p$ in the state $|\psi\rangle$ if and only if $\hat{P}|\psi\rangle = p|\psi\rangle$.

2. Pure state $|\psi\rangle$ implies statistical properties of an ensemble of similarly
prepared systems.

Interpretation (1) is the standard interpretation espoused by 90% of all physicists.
It assumes that, because the state vector plays the most important role
in the mathematical formalism of QM, it must have an equally important role
in the interpretation of QM, so that

$$\text{Properties of world} \Leftrightarrow \text{Properties of } |\psi\rangle \tag{6.403}$$

Interpretation (1) by itself is not consistent with the unitary evolution postulate,
that is, the state $|\text{final}\rangle$ as defined in (6.402) is not equal to an eigenvector of
any indicator (macroscopic pointer) variable. This means that the pointer (of
the meter) will flutter since the $|\pm 1\rangle_M$ states could be macroscopically separated.
Since we never observe this flutter, any interpretation of $|\text{final}\rangle$ as a description
of an individual system cannot be reconciled with both observation and unitary
time evolution.

Interpretation (2) has no such difficulties. $|\psi\rangle$ is just an abstract mathematical
object which implies the probability distributions of the dynamical variables of
an ensemble. It represents a state of knowledge.

Physicists that believe interpretation (1) are forced to introduce a new postulate
at this point to remove these difficulties. This is the so-called reduction/collapse
of the state vector postulate, which says that during any measurement we have
a new real process which causes the transition

$$|\text{final}\rangle \;\to\; a|+\rangle_Q|+1\rangle_M \quad \text{or} \quad b|-\rangle_Q|-1\rangle_M \tag{6.404}$$

so that we end up with an eigenvector of the indicator variable and thus there
will be no flutter.

Various reasons are put forth for making this assumption, i.e., 

measurements are repeatable 

Since this experiment (where the repeated measurement takes place immediately
after the first measurement) has never been realized in the laboratory,
I do not know what to make of a requirement like this one. In addition, in
many experiments (like those involving photons), the system is destroyed by
the measurement (photon is absorbed) making it silly to talk about a repeatable
measurement.

The fact that the reduction process has never been observed in the laboratory
makes it hard to understand in what sense it can be thought of as a real
physical process.

It is important to note that this difficulty only arises for interpretation (1) where
statements are made about state vectors representing individual systems.

Some Proposed Mechanisms for the Reduction

1. The reduction process is caused by an unpredictable and uncontrollable
disturbance of the object by the measuring apparatus
(a non-unitary process).

This means that the Hamiltonian of the system must take the form

$$\hat{H} = \hat{H}_Q + \hat{H}_M + \hat{H}_{QM} \quad \text{where } \hat{H}_{QM} \to \text{disturbance} \tag{6.405}$$

This means, however, that it is already built into the standard unitary time
evolution via $\hat{U} = e^{-i\hat{H}t/\hbar}$ and, thus, the disturbance terms can only lead
to a final state that is still a superposition of indicator variable states. IT
DOES NOT WORK unless we are not told what is meant by unpredictable
and uncontrollable disturbance!

2. The observer causes the reduction process when she reads the 
result of the measurement from the apparatus. 

This is just a variation of (1). Here, the observer is just another indicator
device. The new final state becomes

$$|\text{final}\rangle = a|+\rangle_Q|+1\rangle_M|\text{sees } +1\rangle_O + b|-\rangle_Q|-1\rangle_M|\text{sees } -1\rangle_O \tag{6.406}$$

which is still a superposition and thus is NO HELP. It also introduces
consciousness into QM and that, in my opinion, is just silly!

3. The reduction is caused by the environment (called decoherence),
where by environment is meant the rest of the universe
other than the Q-system and the M-system.

In this model, the environment is a very large system with an enormous 
number of degrees of freedom. We do not have any information about 
most of the degrees of freedom and thus must average over them. This 
causes pure states to change into nonpure or mixed states in a non-unitary 
process as we will see later in the book. 

Why do many physicists think an individual Q-system must have its own state 
vector or wave function and then assume the collapse postulate? 

IT WORKS for doing calculations! 

This view has survived so long because it does not lead to any serious errors in 
most situations. Why? 

In general, predictions in quantum mechanics are derived from $|\psi\rangle$, which gives
the wave function and which, in turn, gives the probabilities. The operational
significance of a probability is a relative frequency, so that the experimentalist
has to invoke an ensemble of similar systems to make any comparisons with theory
that is independent of any particular interpretation of the wave function.
So interpretation (2) is being used in the end anyway.

Does this mean that we should stop worrying about the interpretation of the 
wave function? NO! 

But that is the subject of another book. 

In this book, we will not be dealing with such questions, that is, we do not ask 
questions that require the collapse postulate and use a different mathematical 
formalism for quantum mechanics. 

What about interpretation (2)? It says that 

A pure state describes the statistical 
properties of an ensemble of similarly 
prepared systems. 

This means that in many situations we must use the density operator $\hat{W}$ or $\hat{\rho}$
as the fundamental mathematical object of quantum mechanics instead of the
state vector.

It turns out that some systems only have a density operator $\hat{\rho}$ and do not have
a legitimate state vector $|\psi\rangle$.

For example, consider a box containing a very large number of electrons, each
having spin = 1/2. As we shall see later, this means the spin can have a measurable
component = ±1/2 along any direction. An oriented Stern-Gerlach device
measures these spin components as we will see later.

Now, suppose the box has a hole so that electrons can get out and go into a
Stern-Gerlach device oriented to measure z-components (an arbitrary choice).
We will find the results

$$+\frac{1}{2} \;\; 50\% \text{ of the time and } -\frac{1}{2} \;\; 50\% \text{ of the time} \tag{6.407}$$

We then ask the question - what are the properties of the electrons in the box? 

There are two possibilities, namely, 

1. Each individual electron has the same state vector

$$|\psi\rangle_Q = \frac{1}{\sqrt{2}}|z = +1/2\rangle + \frac{1}{\sqrt{2}}|z = -1/2\rangle = |\psi\rangle_{BOX} \tag{6.408}$$

which is a superposition.

2. 1/2 of the electrons have $z = +1/2$ and 1/2 of the electrons have $z = -1/2$
so that

$$|\psi\rangle_Q = |z = +1/2\rangle \;\text{ OR }\; |z = -1/2\rangle \tag{6.409}$$

so that

$$|\psi\rangle_{BOX} = \frac{1}{\sqrt{2}}|z = +1/2\rangle + \frac{1}{\sqrt{2}}|z = -1/2\rangle \tag{6.410}$$

which seems to be the same state $|\psi\rangle_{BOX}$ as (1), but it is really NOT a
superposition state in this case.

Therefore, it seems that we will not be able to tell which possibility is the correct
one!

However, it will turn out that

$$|x\text{-component} = +1/2\rangle = \frac{1}{\sqrt{2}}|z = +1/2\rangle + \frac{1}{\sqrt{2}}|z = -1/2\rangle \tag{6.411}$$

so that, in case (1), if we orient the Stern-Gerlach device to measure x-components
we would find all the electrons are in the same state $|x\text{-component} = +1/2\rangle$, that
is, they are all the same!

On the other hand, in case (2) since (as we will see later)

$$|z = \pm 1/2\rangle = \frac{1}{\sqrt{2}}|x = +1/2\rangle \pm \frac{1}{\sqrt{2}}|x = -1/2\rangle \tag{6.412}$$

we would find that

$$\frac{1}{2} \text{ give the } |x = +1/2\rangle \text{ result and } \frac{1}{2} \text{ give the } |x = -1/2\rangle \text{ result} \tag{6.413}$$

Therefore, the states are not the same! If we try to write a state vector for case
(2) we have to write

$$|\psi\rangle_Q = \frac{1}{\sqrt{2}}|z = +1/2\rangle + \frac{e^{i\alpha}}{\sqrt{2}}|z = -1/2\rangle \tag{6.414}$$

instead of

$$|\psi\rangle_{BOX} = \frac{1}{\sqrt{2}}|z = +1/2\rangle + \frac{1}{\sqrt{2}}|z = -1/2\rangle \tag{6.415}$$

where $\alpha$ is a completely unknown relative phase factor, which must be averaged
over during any calculations since it is different for each separate measurement
(each member of the ensemble). With that property for $\alpha$, this is not a legitimate
state vector in my opinion. We note that in a true superposition, the relative
phase factors between components are known exactly!


If we use density matrices we have a different story. For a pure state we can
always write $\hat{\rho} = |\psi\rangle\langle\psi|$ for some state vector $|\psi\rangle$.

In fact, case (1) gives

$$\hat{\rho} = \frac{1}{2}\left(|1/2\rangle\langle 1/2| + |1/2\rangle\langle -1/2| + |-1/2\rangle\langle 1/2| + |-1/2\rangle\langle -1/2|\right) \Rightarrow \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \tag{6.416}$$

where, as we saw earlier, the diagonal matrix elements represent probabilities.
The existence of the off-diagonal matrix elements implies that we will observe
quantum interference effects in this system.

Clearly, any pure state density operator cannot be written as the sum of pure
state projection operators as we proved earlier.

In case (2), however, we have

$$\hat{\rho} = \frac{1}{2}\left(|1/2\rangle\langle 1/2| + |-1/2\rangle\langle -1/2|\right) \Rightarrow \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{6.417}$$

which clearly is the sum of pure state projection operators. This corresponds
to a nonpure or mixed state. Note that the off-diagonals are zero so that this
density operator cannot lead to any quantum interference effects as we might
expect.

If we treat case (2) as a pure state with the extra relative phase factor we would
obtain

$$\hat{\rho} = \frac{1}{2}\left(|1/2\rangle\langle 1/2| + e^{-i\alpha}|1/2\rangle\langle -1/2| + e^{i\alpha}|-1/2\rangle\langle 1/2| + |-1/2\rangle\langle -1/2|\right) \Rightarrow \frac{1}{2}\begin{pmatrix} 1 & e^{-i\alpha} \\ e^{i\alpha} & 1 \end{pmatrix} \tag{6.418}$$

which becomes

$$\hat{\rho} = \frac{1}{2}\left(|1/2\rangle\langle 1/2| + |-1/2\rangle\langle -1/2|\right) \Rightarrow \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{6.419}$$

when we average over $\alpha$. The decoherence process has this effect on a very short
time scale.
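The effect of averaging over the unknown phase can be checked numerically. The sketch below assumes a uniform average over $\alpha$ on a grid of equally spaced values (a stand-in for "completely unknown"), and uses $Tr(\hat{\rho}^2)$ as a simple purity measure; the function names are illustrative and not from the text.

```python
import cmath
import math

def rho_with_phase(alpha):
    """Density matrix of (|1/2> + e^{i alpha}|-1/2>)/sqrt(2), cf. (6.418)."""
    return [[0.5, 0.5 * cmath.exp(-1j * alpha)],
            [0.5 * cmath.exp(1j * alpha), 0.5]]

def average_over_phase(n=360):
    """Uniform average of rho over n equally spaced values of alpha."""
    avg = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        rho = rho_with_phase(2 * math.pi * k / n)
        for i in range(2):
            for j in range(2):
                avg[i][j] += rho[i][j] / n
    return avg

def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, less than 1 for a mixed state."""
    return complex(sum(rho[i][j] * rho[j][i]
                       for i in range(2) for j in range(2))).real

def prob_x_plus(rho):
    """<x=+1/2| rho |x=+1/2> with |x=+1/2> = (|1/2> + |-1/2>)/sqrt(2), cf. (6.411)."""
    return complex(0.5 * (rho[0][0] + rho[0][1] + rho[1][0] + rho[1][1])).real

rho_case1 = rho_with_phase(0.0)     # true superposition, (6.416)
rho_case2 = average_over_phase()    # phase-averaged, approaches (6.419)
```

Case (1) has purity 1 and gives the x = +1/2 result with probability 1, while the phase-averaged matrix has purity 1/2 and gives x = +1/2 with probability 1/2, matching (6.413).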


6.17. Putting Some of These Ideas Together 

6.17.1. Composite Quantum Systems; Tensor Product 

Let us now look at composite systems again but now in the context of a special 
kind of state called an “entangled” state, which illustrates some of the more 
dramatic features of quantum physics. 

Most of our discussions so far apply easily to quantum systems comprised of
only one part, i.e., a single particle. As we will see shortly, it will be straightforward
to use this formalism to deal with a single particle evolving in the presence
of external fields. In these cases, the external fields are treated as ordinary classical
fields.

We did not, however, attempt to solve the system at the level where the particle
is interacting (quantum mechanically) with the other particles that are actually
generating the external fields. In this case all parts of the system must be dealt
with using quantum mechanics.

We also indicated earlier in this chapter how we might set up such a multiparticle
system, without, however, indicating how this formalism might be used.

We now redo the multiparticle formalism and expand our discussion in several
directions with the goal of describing a system where all the particles are
interacting quantum mechanically.

Hilbert Space for Individual Quantum Systems 

If we have a quantum system, then we can describe it as a vector

$$|\psi\rangle = c_1|\phi_1\rangle + c_2|\phi_2\rangle + \ldots + c_N|\phi_N\rangle \tag{6.420}$$

with respect to a set of basis vectors $|\phi_i\rangle$. The span (or set of all possible linear
combinations) of these basis vectors makes up the Hilbert space

$$\mathcal{H} = \{|\phi_1\rangle, |\phi_2\rangle, \ldots, |\phi_N\rangle\} \tag{6.421}$$

along with the inner product $\langle\phi_i|\phi_j\rangle = \delta_{ij}$. Writing out these basis vectors, we
usually pick an ordering and assign them the unit vectors,

$$|\phi_1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \ , \quad |\phi_2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \ , \; \ldots \; , \quad |\phi_N\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} \tag{6.422}$$

Individual quantum systems live in their own individual Hilbert spaces as shown
in Figure 6.4 below.

Figure 6.4: Quantum systems are described by vectors in their own Hilbert space


We know that we can write every possible state of the system as some superposition
of the basis vectors.

We usually think about the standard projectors associated with the basis vectors

$$\hat{P}_i = |\phi_i\rangle\langle\phi_i| \tag{6.423}$$

given by the outer products of the different basis vectors with their own dual
vectors (or linear functionals) such as,

$$\hat{P}_1 = |\phi_1\rangle\langle\phi_1| = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & \cdots \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix} \tag{6.424}$$

As we discussed earlier, the essential usefulness of the standard projectors is that
their expectation values give us the probability that if we measured a system
$|\psi\rangle$ to determine which state it is in, we would get the different basis vectors
with probability of the form,

$$\text{Prob}(|\phi_i\rangle) = \langle\psi|\hat{P}_i|\psi\rangle = |c_i|^2 \tag{6.425}$$

We also remember that a key property of the projectors is that they sum to the
identity

$$\sum_{i=1}^{N}\hat{P}_i = \hat{I} \tag{6.426}$$


Two-Level Systems 

To make the discussion less unwieldy, we will work with quantum systems that
live in a 2-dimensional Hilbert space, such as a spin-1/2 particle or photon
polarization (both of which will be discussed in detail in later chapters). For now
we only need to know that our physical system (particle) has two eigenstates of
some observable when it is measured in any direction (in physical space). We will
call these states up and down in the direction of measurement. In particular, we
choose our basis to be given by the (up, down) states measured in the z-direction,
which we designate as

$$\mathcal{H} = \{|\uparrow\rangle, |\downarrow\rangle\} \tag{6.427}$$

Notice that we picked an ordering for the basis vectors and we can therefore
assign unit vectors to them,

$$|\uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \ , \quad |\downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{6.428}$$

such that any state of the particle can be described as

$$|\psi\rangle = c_\uparrow|\uparrow\rangle + c_\downarrow|\downarrow\rangle \tag{6.429}$$

If we performed a measurement of the observable in the z-direction, we would
get two outcomes with probabilities

$$\text{Prob}(\uparrow) = \langle\psi|\hat{P}_\uparrow|\psi\rangle = |c_\uparrow|^2 \ , \quad \text{Prob}(\downarrow) = \langle\psi|\hat{P}_\downarrow|\psi\rangle = |c_\downarrow|^2 \tag{6.430}$$

using the corresponding projectors

$$\hat{P}_\uparrow = |\uparrow\rangle\langle\uparrow| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ , \quad \hat{P}_\downarrow = |\downarrow\rangle\langle\downarrow| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \tag{6.431}$$


That is basically everything we need to know about a single particle. 
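The projector bookkeeping of (6.428) through (6.431) fits in a few lines of Python. This is a minimal sketch with an arbitrarily chosen normalized example state; the helper names `outer` and `prob` are illustrative.

```python
def outer(u, v):
    """|u><v| as a matrix (conjugate taken on the bra)."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def prob(P, psi):
    """<psi|P|psi> for a projector P."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * P[i][j] * psi[j]
               for i in range(n) for j in range(n)).real

up = [1, 0]
down = [0, 1]
P_up = outer(up, up)
P_down = outer(down, down)

# Completeness: the projectors sum to the identity, cf. (6.426)
P_sum = [[P_up[i][j] + P_down[i][j] for j in range(2)] for i in range(2)]

# Example state c_up|up> + c_down|down> with |c_up|^2 + |c_down|^2 = 1 (assumed values)
psi = [0.6, 0.8j]
prob_up = prob(P_up, psi)
prob_down = prob(P_down, psi)
```

For this example state the two probabilities are $|c_\uparrow|^2 = 0.36$ and $|c_\downarrow|^2 = 0.64$, and they sum to one because the projectors sum to the identity.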


Hilbert Space for Composite Systems 

Let us begin building up the Hilbert space for two distinct particles (two distinct
quantum systems). For example, suppose that I have a particle in my lab and
you have a particle in your lab. They have never come in contact with one
another and, for all intents and purposes, I have no idea what you have done with
your particle and vice versa. In this case, it seems to make perfect sense that
we could just treat the particles completely independently at the level of their
Hilbert spaces. We would have something like that shown in Figure 6.5 below.



Figure 6.5: Hilbert space for 2 quantum systems independent of one another 
We really should be able to think about these systems as entirely disjoint. 
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Therefore, we can define two different Hilbert spaces, 


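$$\mathcal{H}_A = \{|\uparrow\rangle_A, |\downarrow\rangle_A\} \tag{6.432}$$

and

$$\mathcal{H}_B = \{|\uparrow\rangle_B, |\downarrow\rangle_B\} \tag{6.433}$$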

with operators such as projectors that only act on states in their respective 
systems 

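$$P_\uparrow^A = |\uparrow\rangle_A \langle\uparrow|_A , \quad P_\downarrow^A = |\downarrow\rangle_A \langle\downarrow|_A , \quad I^A = P_\uparrow^A + P_\downarrow^A \tag{6.434}$$

and

$$P_\uparrow^B = |\uparrow\rangle_B \langle\uparrow|_B , \quad P_\downarrow^B = |\downarrow\rangle_B \langle\downarrow|_B , \quad I^B = P_\uparrow^B + P_\downarrow^B \tag{6.435}$$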

In terms of their matrices, for example, 


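$$P_\uparrow^A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \quad P_\uparrow^B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{6.436}$$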


these operators look identical to one another. However, they are not really 
identical, because you are only allowed to use operators (matrices) with A labels 
on states with A labels and operators with B labels on states with B labels. 
These rules reflect the situation that the particles are separate physical systems, 
possibly in distant locations, that have no idea that the other even exists. 


Tensor Product of Hilbert Spaces 

Now suppose we bring our individual particles from our independent labs together.
In this situation, it is not clear whether we can get away with describing
the two systems using two separate Hilbert spaces. For example, what if our
particles interact with one another? Can we still describe them as independent
systems? It is not clear.

In order to be safe, we had better assume that we cannot still treat the systems 
as living in their own spaces and we must now assemble a suitable composite 
Hilbert space. The process is shown in Figure 6.6 below. 



Figure 6.6: Hilbert space for 2 quantum systems independent of one another

In order to do so, let us begin at the level of describing the basis vectors of our
new space. System A involves up and down basis vectors and so does system
B. So, at the level of the basis vectors, if system A is up, B can be either up or
down, corresponding to

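$$\{|\uparrow\rangle_A |\uparrow\rangle_B , |\uparrow\rangle_A |\downarrow\rangle_B\} \tag{6.437}$$

or system A can be down and system B could be either up or down

$$\{|\downarrow\rangle_A |\uparrow\rangle_B , |\downarrow\rangle_A |\downarrow\rangle_B\} \tag{6.438}$$

Therefore, we build our composite Hilbert space with the four basis vectors

$$\mathcal{H}_{AB} = \{|\uparrow\rangle_A |\uparrow\rangle_B , |\uparrow\rangle_A |\downarrow\rangle_B , |\downarrow\rangle_A |\uparrow\rangle_B , |\downarrow\rangle_A |\downarrow\rangle_B\} \tag{6.439}$$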

What are these funny objects involving some sort of product of basis kets? They 
cannot be any normal type of matrix-vector multiplication since you cannot 
multiply a column vector by a column vector. Instead, let us proceed as follows. 
Given our ordering of the four basis vectors, we can associate them with unit 
vectors. However, since there are now four basis vectors, our Hilbert space must 
be 4-dimensional, 


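$$|\uparrow\rangle_A |\uparrow\rangle_B = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad |\uparrow\rangle_A |\downarrow\rangle_B = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad |\downarrow\rangle_A |\uparrow\rangle_B = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad |\downarrow\rangle_A |\downarrow\rangle_B = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \tag{6.440}$$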


This method of combining two 2-dimensional Hilbert spaces into a single
4-dimensional Hilbert space is known as a tensor product. We discussed this
earlier and gave it a special symbol $\otimes$,

$$\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B \tag{6.441}$$


to indicate that we are multiplying vector spaces together. Notice that the 
dimension of the tensor product Hilbert space is the product of the dimensions 
of the individual spaces. 


Of course, since we have four basis vectors, we must now have four standard 
projectors of the type 

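$$P_{\uparrow\uparrow}^{AB} = |\uparrow\rangle_A |\uparrow\rangle_B \, {}_A\langle\uparrow| \, {}_B\langle\uparrow| \tag{6.442}$$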

This notation gets really clumsy after a while, so it is convention to shorten it 
to 

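$$P_{\uparrow\uparrow}^{AB} = |\uparrow\uparrow\rangle_{AB} \langle\uparrow\uparrow| , \quad P_{\uparrow\downarrow}^{AB} = |\uparrow\downarrow\rangle_{AB} \langle\uparrow\downarrow|$$

$$P_{\downarrow\uparrow}^{AB} = |\downarrow\uparrow\rangle_{AB} \langle\downarrow\uparrow| , \quad P_{\downarrow\downarrow}^{AB} = |\downarrow\downarrow\rangle_{AB} \langle\downarrow\downarrow| \tag{6.443}$$


Tensor Product of Matrices (Repeating earlier ideas for clarity)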

We can compute the matrix representations of the projectors for our composite 
system by multiplying out each basis vector with its dual vector. These are 
4-dimensional matrices 

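$$P_{\uparrow\uparrow}^{AB} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad P_{\uparrow\downarrow}^{AB} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

$$P_{\downarrow\uparrow}^{AB} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad P_{\downarrow\downarrow}^{AB} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{6.444}$$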

Of course, it would be nice to have a systematic method for constructing operators
on the tensor product Hilbert space from operators that act only on the
individual Hilbert spaces. This is definitely not the standard matrix product,
which we can see by looking at the projectors

$$P_{\uparrow\uparrow}^{AB} \neq P_\uparrow^A P_\uparrow^B \tag{6.445}$$




since 


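$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{6.446}$$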


i.e., clearly their dimensions do not match. Instead, we need a tensor product 
for the projection matrices that reflects the same structure as the composite 
Hilbert space that we constructed. We symbolize such a product as 


$$P_{\uparrow\uparrow}^{AB} = P_\uparrow^A \otimes P_\uparrow^B \tag{6.447}$$


How do we perform this tensor product between matrices? Well first, it needs 
to be an operation that yields a matrix whose dimension is the product of 
the dimensions of the matrices being multiplied. Second, it has to respect the 
definition of the ordering that we used to construct the tensor product Hilbert 
space. Such a product is defined by the following 

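$$\begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix} \otimes \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix} = \begin{pmatrix} X_{11} \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix} & X_{12} \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix} \\ X_{21} \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix} & X_{22} \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix} \end{pmatrix} \tag{6.448}$$

In other words, we take each element in the first matrix and replace it with a
copy of the second matrix scaled by the element. This process, called the matrix
tensor product, seems to do the trick. We can check for projectors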


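$$P_{\uparrow\uparrow}^{AB} = P_\uparrow^A \otimes P_\uparrow^B = \begin{pmatrix} (1) P_\uparrow^B & (0) P_\uparrow^B \\ (0) P_\uparrow^B & (0) P_\uparrow^B \end{pmatrix} = \begin{pmatrix} (1) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} & (0) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \\ (0) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} & (0) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$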

and the result checks out. You can verify it for the other projectors on your 
own. 
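
The check for the remaining projectors can also be done numerically. The sketch below assumes that NumPy's `np.kron` (the Kronecker product) is the matrix tensor product described above with the same ordering of basis vectors; the variable names are illustrative only.

```python
import numpy as np

P_up = np.array([[1, 0], [0, 0]])
P_down = np.array([[0, 0], [0, 1]])
up = np.array([1, 0])
down = np.array([0, 1])

# Composite basis vectors in the ordering (up-up, up-down, down-up, down-down).
basis_AB = [np.kron(a, b) for a in (up, down) for b in (up, down)]

# Composite projectors built two ways: from the 4-dimensional basis vectors,
# and from the Kronecker product of the single-particle projectors.
from_vectors = [np.outer(v, v) for v in basis_AB]
from_kron = [np.kron(PA, PB) for PA in (P_up, P_down) for PB in (P_up, P_down)]

# True when both constructions give the same four matrices.
agree = all(np.array_equal(x, y) for x, y in zip(from_vectors, from_kron))
```

Building the projectors both ways and comparing them is a direct test that the element-by-element replacement rule respects the chosen basis ordering.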


6.17.2. Quantum Entanglement and the EPR Paradox 

The operators 


$$P_\uparrow^A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \quad P_\uparrow^B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{6.449}$$

look identical to one another. However, they are not really identical because
you are only allowed to use operators (matrices) with A labels on states with A
labels and operators with B labels on states with B labels. These rules, as we
said earlier, reflect the situation that the particles are separate physical systems,
possibly in distant locations, that have no idea that the other even exists.


States in the Tensor Product Hilbert Space 

These ideas lead to a number of intriguing properties about composite quantum 
systems that we will now discuss. 

Suppose that we took the state 

$$|\psi\rangle_A = c_\uparrow^A |\uparrow\rangle_A + c_\downarrow^A |\downarrow\rangle_A \in \mathcal{H}_A \tag{6.450}$$

from Alice’s lab, and the state 

$$|\psi\rangle_B = c_\uparrow^B |\uparrow\rangle_B + c_\downarrow^B |\downarrow\rangle_B \in \mathcal{H}_B \tag{6.451}$$

from Bob’s lab. Alice and Bob both prepared their systems completely independently,
with no communication between them to indicate what they were
doing in their own labs. When we bring their systems together, we should
express them as vectors in the composite Hilbert space

$$|\psi\rangle_A |\psi\rangle_B = \left( c_\uparrow^A |\uparrow\rangle_A + c_\downarrow^A |\downarrow\rangle_A \right) \otimes \left( c_\uparrow^B |\uparrow\rangle_B + c_\downarrow^B |\downarrow\rangle_B \right) \in \mathcal{H}_{AB} \tag{6.452}$$

After taking the tensor product between these states, we get

$$|\psi\rangle_{AB} = c_\uparrow^A c_\uparrow^B |\uparrow\uparrow\rangle_{AB} + c_\uparrow^A c_\downarrow^B |\uparrow\downarrow\rangle_{AB} + c_\downarrow^A c_\uparrow^B |\downarrow\uparrow\rangle_{AB} + c_\downarrow^A c_\downarrow^B |\downarrow\downarrow\rangle_{AB} \tag{6.453}$$

The important point to notice in this expression is that there is a special
relationship between the coefficients: the coefficient for each composite basis
vector is the product of the coefficients for the individual basis vectors in this case.


Quantum Entanglement 

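In (6.452), we found the expression for two, unrelated, independent quantum
systems expressed in the Hilbert space of the composite system. However, we
know that, in general, we can describe an arbitrary state in $\mathcal{H}_{AB}$ as an arbitrary
superposition of the four basis vectors

$$|\psi\rangle_{AB} = c_{\uparrow\uparrow} |\uparrow\uparrow\rangle_{AB} + c_{\uparrow\downarrow} |\uparrow\downarrow\rangle_{AB} + c_{\downarrow\uparrow} |\downarrow\uparrow\rangle_{AB} + c_{\downarrow\downarrow} |\downarrow\downarrow\rangle_{AB} \tag{6.454}$$

This is the most general possible expression for the composite states since we
made sure to use the most general basis possible and we imposed no relationship
between the expansion coefficients (other than the state must be normalized).


Here is the big question: can every state of the form (6.454)

$$|\psi\rangle_{AB} = c_{\uparrow\uparrow} |\uparrow\uparrow\rangle_{AB} + c_{\uparrow\downarrow} |\uparrow\downarrow\rangle_{AB} + c_{\downarrow\uparrow} |\downarrow\uparrow\rangle_{AB} + c_{\downarrow\downarrow} |\downarrow\downarrow\rangle_{AB}$$

be written in the product form (6.453)

$$|\psi\rangle_{AB} = c_\uparrow^A c_\uparrow^B |\uparrow\uparrow\rangle_{AB} + c_\uparrow^A c_\downarrow^B |\uparrow\downarrow\rangle_{AB} + c_\downarrow^A c_\uparrow^B |\downarrow\uparrow\rangle_{AB} + c_\downarrow^A c_\downarrow^B |\downarrow\downarrow\rangle_{AB}$$

The answer is a resounding no. Consider the simple example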


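$$|\psi\rangle_{AB} = \frac{1}{\sqrt{2}} |\uparrow\downarrow\rangle_{AB} + \frac{1}{\sqrt{2}} |\downarrow\uparrow\rangle_{AB} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} \tag{6.455}$$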


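Here, if we were to try and find a product state expression for this special
composite state vector, we would first infer that the products of the individual
coefficients must satisfy

$$c_\uparrow^A c_\downarrow^B = c_\downarrow^A c_\uparrow^B = \frac{1}{\sqrt{2}} \tag{6.456}$$

and

$$c_\uparrow^A c_\uparrow^B = c_\downarrow^A c_\downarrow^B = 0 \tag{6.457}$$

which are not consistent with each other, i.e., there is no solution! Condition
(6.456) requires all four coefficients to be nonzero, while (6.457) requires at
least two of them to vanish.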

Therefore, we must conclude that the state (6.455)

$$|\psi\rangle_{AB} = \frac{1}{\sqrt{2}} |\uparrow\downarrow\rangle_{AB} + \frac{1}{\sqrt{2}} |\downarrow\uparrow\rangle_{AB} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$$

cannot be expressed as a product between states in $\mathcal{H}_A$ and $\mathcal{H}_B$. States in
a composite Hilbert space, such as this one, that cannot be factorized into
a product of states in the constituent Hilbert spaces are referred to as being
entangled.
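
The factorization question can be tested mechanically for two 2-dimensional systems. The sketch below arranges the four coefficients into a 2x2 matrix (rows labeled by A, columns by B); a product state gives a matrix of the form (c^A)(c^B)^T, whose determinant vanishes. The function name `is_product_state` and the sample amplitudes are assumptions introduced for this illustration, not something taken from the text.

```python
import numpy as np

def is_product_state(psi_AB, tol=1e-12):
    """True if the 4-component state factorizes into two 2-component states."""
    # Rows are labeled by A's index, columns by B's index.
    C = np.asarray(psi_AB, dtype=complex).reshape(2, 2)
    # For a product state C is an outer product, so its determinant is zero.
    return abs(np.linalg.det(C)) < tol

# Product state built from hypothetical single-particle amplitudes.
psi_A = np.array([0.6, 0.8])
psi_B = np.array([1.0, 1.0j]) / np.sqrt(2)
product = np.kron(psi_A, psi_B)

# The state (|up,down> + |down,up>)/sqrt(2) discussed in the text.
entangled = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)
```

Here the determinant is c_uu c_dd - c_ud c_du, so the example state gives -1/2 and fails the test, in line with the inconsistency found between (6.456) and (6.457).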


Consequences of Entanglement 

I strongly believe that entanglement is the most important difference between 
quantum and classical physics. Let us see why by considering the following 
situation. Suppose that Alice and Bob get together in a common lab and prepare 
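the joint state (6.455)

$$|\psi\rangle_{AB} = \frac{1}{\sqrt{2}} |\uparrow\downarrow\rangle_{AB} + \frac{1}{\sqrt{2}} |\downarrow\uparrow\rangle_{AB} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$$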

After it is made, Alice takes one of the particles and Bob takes the other (we 
can assume Alice takes the first one and Bob takes the second without loss of 
generality). Then both of them return to the individual labs without changing 
the state of their respective particles. When she gets back home, Alice decides 
to perform a measurement on her particle to see what state it is in. Such a 
measurement, performed by Alice on only part of the pair of particles, would 
be described by operators such as the projectors (remember projectors give us 
probabilities) 


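$$P_\uparrow^A \otimes I^B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{6.458}$$

and

$$P_\downarrow^A \otimes I^B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{6.459}$$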


Notice that these projectors have the correct dimension since they are the tensor
product of 2-dimensional operators. Also notice that each of these operators should
be interpreted as doing something to Alice’s particle (the projector part) and
nothing to Bob’s particle as indicated by the identity operator acting on Bob’s
system. The identity operator is the quantum mechanical way of saying that
you did not do anything.


Operators of the above form, a tensor product of a projector on one component
Hilbert space with the identity on the other, are called partial projectors. First,
let us compute the probability that when Alice measures her particle she obtains
the up outcome

$$\mathrm{Prob}_A(\uparrow) = {}_{AB}\langle\psi| P_\uparrow^A \otimes I^B |\psi\rangle_{AB} = \frac{1}{2} \tag{6.460}$$

as well as for the down outcome

$$\mathrm{Prob}_A(\downarrow) = {}_{AB}\langle\psi| P_\downarrow^A \otimes I^B |\psi\rangle_{AB} = \frac{1}{2} \tag{6.461}$$

But here is the truly amazing part. What state do we have following Alice’s
measurement? Suppose that she obtains the up outcome, then (by the reduction
postulate) we get the measurement eigenstate

$$|\psi\rangle_{AB} \to \frac{\left( P_\uparrow^A \otimes I^B \right) |\psi\rangle_{AB}}{\sqrt{{}_{AB}\langle\psi| P_\uparrow^A \otimes I^B |\psi\rangle_{AB}}} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = |\uparrow\downarrow\rangle_{AB} \tag{6.462}$$

where the denominator is to maintain the normalization.

Similarly, when Alice measures down for her particle, the result is the projection

$$|\psi\rangle_{AB} \to \frac{\left( P_\downarrow^A \otimes I^B \right) |\psi\rangle_{AB}}{\sqrt{{}_{AB}\langle\psi| P_\downarrow^A \otimes I^B |\psi\rangle_{AB}}} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = |\downarrow\uparrow\rangle_{AB} \tag{6.463}$$

The interpretation of this result is that Alice knows what Bob’s state is as soon 
as she performs her measurement. Therefore, a local operation in her lab tells 
her global information about the composite state. 
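
The partial-projector calculation can be reproduced in a few lines. This is a sketch under the assumption that `np.kron` implements the matrix tensor product used in the text; the helper name `measure_A` is hypothetical.

```python
import numpy as np

I2 = np.eye(2)
P_up = np.array([[1.0, 0.0], [0.0, 0.0]])
P_down = np.array([[0.0, 0.0], [0.0, 1.0]])

# The joint state (|up,down> + |down,up>)/sqrt(2).
psi = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)

def measure_A(psi, P_A):
    """Probability and normalized post-measurement state for a projector on A only."""
    partial = np.kron(P_A, I2)               # acts on A, leaves B untouched
    prob = np.vdot(psi, partial @ psi).real  # <psi| P_A (x) I_B |psi>
    post = partial @ psi / np.sqrt(prob)     # reduction postulate, renormalized
    return prob, post

prob_up, post_up = measure_A(psi, P_up)
prob_down, post_down = measure_A(psi, P_down)
```

Each outcome occurs with probability 1/2, and the post-measurement vectors are the second and third composite basis vectors, so Bob's state is fixed once Alice's outcome is known.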


This prediction of quantum mechanics was considered so bizarre that it prompted
Einstein, Podolsky and Rosen to essentially denounce quantum mechanics as either
being wrong or at least incomplete (we will define this carefully later in the
book). In order to make their point, the three authors proposed the following
situation, which today is called the EPR Paradox (after their initials). EPR
argued that Alice and Bob could get together and prepare the state just discussed.
Then Bob would climb aboard a rocket and fly to a distant planet. At
that point, Alice and Bob would both measure their states and instantaneously
know the result of the other person’s experiment, despite the long distance between
them. This seemingly violates special relativity since it would mean that
Alice’s and Bob’s physical systems somehow exchanged information at speeds
faster than the speed of light.


6.17.3. Entanglement and Communication

To Communicate Superluminally, or Not 

The immediate question we may ask is whether or not this bizarre information
gain (what Alice learns about the state of Bob’s particle) allows them to communicate
faster than the speed of light. The answer is absolutely, definitely,
certainly, NO! The short answer is that the randomness of the measurement
outcomes saves us, but this question was still of great concern during quantum
mechanics’ infancy, since it led to a number of apparent paradoxes that took
some time to resolve. We will now dispel these rumors of superluminal communication
once and for all.

In order to do so, we begin from the state shared by Alice and Bob

$$|\psi_+\rangle_{AB} = \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle_{AB} + |\downarrow\uparrow\rangle_{AB} \right) \tag{6.464}$$

and suppose that Alice measures the observable of her state along the z-direction (as
we have been doing). Then her possible outcomes correspond to the partial projectors

$$P_{\uparrow_z}^A \otimes I^B = |\uparrow\rangle_A \langle\uparrow| \otimes I^B \quad \text{and} \quad P_{\downarrow_z}^A \otimes I^B = |\downarrow\rangle_A \langle\downarrow| \otimes I^B \tag{6.465}$$

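When Alice measures her up outcome along the z-direction, the joint state transforms
to

$$\uparrow_z : \quad |\psi_+\rangle_{AB} \to \frac{\left( P_{\uparrow_z}^A \otimes I^B \right) |\psi_+\rangle_{AB}}{\sqrt{{}_{AB}\langle\psi_+| P_{\uparrow_z}^A \otimes I^B |\psi_+\rangle_{AB}}} = |\uparrow_z \downarrow_z\rangle_{AB} \tag{6.466}$$

and when she measures the down outcome along the z-direction she gets

$$\downarrow_z : \quad |\psi_+\rangle_{AB} \to \frac{\left( P_{\downarrow_z}^A \otimes I^B \right) |\psi_+\rangle_{AB}}{\sqrt{{}_{AB}\langle\psi_+| P_{\downarrow_z}^A \otimes I^B |\psi_+\rangle_{AB}}} = |\downarrow_z \uparrow_z\rangle_{AB} \tag{6.467}$$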

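Now, what if Alice decides to perform an alternative procedure given by measuring
whether her particle is up or down along the x-direction? Now the projectors
are given by (using corresponding eigenstates for the x-direction which we will
derive for spin and polarization in later chapters)

$$P_{\uparrow_x}^A \otimes I^B = \frac{1}{2} \left( |\uparrow\rangle + |\downarrow\rangle \right) \left( \langle\uparrow| + \langle\downarrow| \right) \otimes I^B \tag{6.468}$$

and

$$P_{\downarrow_x}^A \otimes I^B = \frac{1}{2} \left( |\uparrow\rangle - |\downarrow\rangle \right) \left( \langle\uparrow| - \langle\downarrow| \right) \otimes I^B \tag{6.469}$$

i.e., we use

$$|\uparrow_x\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow\rangle + |\downarrow\rangle \right) \quad \text{and} \quad |\downarrow_x\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow\rangle - |\downarrow\rangle \right) \tag{6.470}$$

in forming Alice’s projectors. If we work through the algebra we find for the
resulting state when Alice obtains the up outcome in the x-direction

$$\uparrow_x : \quad |\psi_+\rangle_{AB} \to \frac{\left( P_{\uparrow_x}^A \otimes I^B \right) |\psi_+\rangle_{AB}}{\sqrt{{}_{AB}\langle\psi_+| P_{\uparrow_x}^A \otimes I^B |\psi_+\rangle_{AB}}} = |\uparrow_x \uparrow_x\rangle_{AB} \tag{6.471}$$

meaning that when Alice measures up in the x-direction, then Bob also measures
up in the x-direction. Conversely, when Alice measures down in the x-direction,
the resulting state (up to an overall phase) is

$$\downarrow_x : \quad |\psi_+\rangle_{AB} \to \frac{\left( P_{\downarrow_x}^A \otimes I^B \right) |\psi_+\rangle_{AB}}{\sqrt{{}_{AB}\langle\psi_+| P_{\downarrow_x}^A \otimes I^B |\psi_+\rangle_{AB}}} = |\downarrow_x \downarrow_x\rangle_{AB} \tag{6.472}$$

and Bob also measures down in the x-direction.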
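
The algebra behind these four reductions is short enough to check numerically. The following is a minimal sketch, assuming the normalized state of (6.464) and the x-direction eigenstates of (6.470); the helper names `collapse_A` and `overlap` are hypothetical. States are compared up to a global phase, since the down-x case picks up an overall sign.

```python
import numpy as np

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
up_x = (up + down) / np.sqrt(2)
down_x = (up - down) / np.sqrt(2)
I2 = np.eye(2)

# |psi_+> = (|up,down> + |down,up>)/sqrt(2)
psi_plus = (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2)

def collapse_A(psi, ket_A):
    """Normalized joint state after Alice finds her particle in ket_A."""
    P = np.kron(np.outer(ket_A, ket_A.conj()), I2)
    post = P @ psi
    return post / np.linalg.norm(post)

def overlap(a, b):
    """|<a|b>|, which equals 1 when the two states agree up to a global phase."""
    return abs(np.vdot(a, b))

after_up_z = collapse_A(psi_plus, up)
after_up_x = collapse_A(psi_plus, up_x)
after_down_x = collapse_A(psi_plus, down_x)
```

The overlaps show anti-correlated outcomes along z and correlated outcomes along x for this particular state.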

Using this for a Communication System 

We can now see whether it is possible for Alice and Bob to communicate by
choosing one basis or another when they want to send different bits of data. For
example, suppose that Bob wishes to send Alice binary zero. He might try to
do so by selecting to measure in the z-direction (use the z-basis). And when
he wishes to send logical one, he would then measure in the x-direction (use
the x-basis). The two parties could agree upon the following communication
protocol:

1. Bob selects a basis, either z or x, to send logical zero or one, respectively
and measures. If he measures up in that basis, the bit is considered good,
otherwise he throws away the bit.

2. When Alice chooses to measure in the z-basis (provided Bob also measured
in the z-basis), she knows that a logical zero has been sent when
she measures down (meaning Bob found up since the measurements are
anti-correlated). When Alice measures in the x-basis (provided that Bob
measured in the x-basis), she knows that Bob sent her a logical one when
she measures up (also implying that Bob measured up since the measurements
are correlated).
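
One way to probe a protocol of this kind numerically is to ask whether Alice's own outcome statistics depend on which basis Bob chose. The sketch below is an illustration added here, not the argument given in the text: it computes Alice's outcome probabilities for the state (6.464), summed over Bob's outcomes, for every combination of z and x bases. The function name `alice_marginal` is hypothetical.

```python
import numpy as np

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
z_basis = (up, down)
x_basis = ((up + down) / np.sqrt(2), (up - down) / np.sqrt(2))

# |psi_+> = (|up,down> + |down,up>)/sqrt(2)
psi_plus = (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2)

def alice_marginal(psi, alice_basis, bob_basis):
    """Alice's outcome probabilities, summed over Bob's outcomes (unknown to her)."""
    return [
        sum(abs(np.vdot(np.kron(a, b), psi)) ** 2 for b in bob_basis)
        for a in alice_basis
    ]

# Alice's statistics for each (Alice basis, Bob basis) combination.
marginals = {
    (a_name, b_name): alice_marginal(psi_plus, a_basis, b_basis)
    for a_name, a_basis in (("z", z_basis), ("x", x_basis))
    for b_name, b_basis in (("z", z_basis), ("x", x_basis))
}
```

In every combination Alice sees up and down with probability 1/2 each, so her local record alone carries no trace of Bob's basis choice.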

Unfortunately for people like Einstein (who were trying to find quantum mechanical
paradoxes), this communication scheme conveys no information from
Bob to Alice. This is because Alice must pick her measurement basis randomly.
She cannot know the order in which to choose her measurement bases in advance,
otherwise she would have just carried the classical information along
with her. Instead, she must guess by flipping a coin prior to each measurement.
Therefore, she picks the x-basis with probability 1/2 and the z-basis with probability
1/2. Unfortunately, when she picks the wrong basis, her measurement is
perfectly uncorrelated with Bob’s, so no information can be conveyed.
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6.17.4. Nonlocality and Tests of Quantum Entanglement 

We have seen that entanglement between different components of a composite
quantum system is capable of displaying strong correlations between the outcomes
of measurements performed on separate components. In particular, we
have been looking at states such as

$$|\psi_+\rangle_{AB} = \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle_{AB} + |\downarrow\uparrow\rangle_{AB} \right) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} \in \mathcal{H}_{AB} \tag{6.473}$$

in the composite Hilbert space $\mathcal{H}_{AB}$ shared by both Alice and Bob. An essential
point to remember, something we learned earlier, is that it is impossible to
factorize this state

$$|\psi_+\rangle_{AB} \neq |\psi_A\rangle_A |\psi_B\rangle_B \tag{6.474}$$

for any possible choice of states. In other words, there is no way for Alice and
Bob to make this state independently (think of the joint Hilbert space $\mathcal{H}_{AB}$ as
being the set of states they can make together and the individual Hilbert spaces
$\mathcal{H}_A$ and $\mathcal{H}_B$ as the sets of states that can be made independently).


Indeed, this prediction of quantum mechanics is somewhat bizarre. It tells us 
that Alice and Bob working together have more power than they do working 
separately. The question we want to address now is whether or not it would 
be possible for Alice or Bob to mimic anything that behaves like quantum 
entanglement using classical physics. Classically, we would expect to have the 
following properties hold true 

1. Objective Reality meaning that even though Alice and Bob do not know the 
outcome of measurements they might perform on their physical systems, 
the particle itself knows what outcome it will produce when measured. 

2. Local Determinism meaning that if Alice and Bob have their particles in 
different space-like separated locations, anything Alice does should have 
no effect on what Bob measures and vice versa. In other words, Alice’s 
measurement outcome should be determined only by what she has in her 
lab with her. 


Bell Inequalities 

In order to test whether these classical assumptions are true, John Bell suggested
the following type of experiment back in the 1960’s. Suppose that there
is a pair source midway between Alice’s and Bob’s labs (which are distant from
one another). That is to say, some device produces arbitrary (possibly random)
states in the joint Hilbert space $\mathcal{H}_{AB}$ and then sends one half of the pair to Alice
and one half to Bob. Both Alice and Bob know when to expect these particles
to arrive, but have no idea what state was made by the source.

Alice and Bob each have a device to measure (up or down) their incoming particle
along one of two different possible axes (directions). Therefore, think of the
measurement devices each having a switch to select between the two different
possible measurement directions and an output meter to indicate whether the
measurement outcome is up or down along that direction. We will label the
two different possible measurement directions by A, A' and B, B' (for example,
A might mean to measure along the x-direction and A' along the y-direction,
and so on).

Prior to each particle arrival, Alice and Bob each pick an independent direction
to measure by setting their respective switches, wait for the measurement outcome
and then record the result. They can repeat this process many times such
that their measurement record might look something like:


    Alice setting   Alice outcome   Bob setting   Bob outcome
    A               -1              B'            -1
    A               -1              B             +1
    A'              +1              B'            +1
    A               -1              B             -1
    A               -1              B'            +1
    A'              +1              B             -1
    A'              -1              B'            +1
    A               +1              B             +1
    A'              -1              B             -1
    A               +1              B             -1


Table 6.1: Sample Data 


where we indicate up by +1 and down by -1. 

Given this structure for the experiment, let us return to our concepts of local 
determinism and objective reality. Well, objective reality suggests that each 
particle should “know” whether it will be up or down for both possible mea¬ 
surement settings. Local determinism suggests Alice’s measurement should be 
independent of the switch setting on Bob’s measurement instrument. 

Internal Information 

Under the combined assumptions of local determinism and objective reality,
each pair of particles must have the following information encoded into its internal
states

$$A_n = \pm 1 , \quad A'_n = \pm 1 , \quad B_n = \pm 1 , \quad B'_n = \pm 1 \tag{6.475}$$

Each of the four possible measurement labels is treated as a random variable,
with the subscript indicating which round of the measurement. So, for example,
$A_4 = +1$ would mean that if Alice measures along the direction A in round n = 4,
she will get up.

Now, let us consider the following function of the random variables

$$g_n = A_n B_n + A'_n B_n + A_n B'_n - A'_n B'_n \tag{6.476}$$

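Now, were we to tabulate the value of $g_n$ for all possible choices of $A_n, A'_n, B_n, B'_n$,
we would find that $g_n = \pm 2$ always. One way to see this is to write
$g_n = (A_n + A'_n) B_n + (A_n - A'_n) B'_n$: one of the two brackets is always zero and
the other is $\pm 2$. We can try this out for a set of possible values, for example,

$$A_n = +1 , \ A'_n = -1 , \ B_n = -1 , \ B'_n = -1 \ \to \ g_n = -1 + 1 - 1 - 1 = -2 \tag{6.477}$$

In any case, we may write down the inequality

$$|\langle g_n \rangle| = \left| \frac{1}{N} \sum_{n=1}^{N} g_n \right| = \frac{1}{N} \left| \sum_{n=1}^{N} A_n B_n + \sum_{n=1}^{N} A'_n B_n + \sum_{n=1}^{N} A_n B'_n - \sum_{n=1}^{N} A'_n B'_n \right| \leq 2 \tag{6.478}$$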


In other words, since the extrema of g n are ±2, then the average over many 
trials must be no larger than +2 and no smaller than -2. Thus, the absolute 
value of the average must be no greater than 2. 
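
The claim that g_n takes only the values +2 and -2 can be confirmed by brute force over all 16 assignments. The sketch below is a direct enumeration; the function name `g` is just a label for the combination in (6.476).

```python
from itertools import product

def g(A, A_prime, B, B_prime):
    """The combination g_n of (6.476) for one round."""
    return A * B + A_prime * B + A * B_prime - A_prime * B_prime

# All 16 assignments of +1/-1 to (A, A', B, B').
values = {g(*assignment) for assignment in product((+1, -1), repeat=4)}

# The worked example from the text: A = +1, A' = -1, B = -1, B' = -1.
example = g(+1, -1, -1, -1)
```

Since every round contributes either +2 or -2, any average over rounds necessarily lies between -2 and +2.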


We will prove all of these results in detail in Chapter 16. 


This inequality is one of several that are known as Bell inequalities (this form
was developed by Clauser, Horne, Shimony and Holt). The true genius of such
an inequality is that it provides us with a simple test, based only on probability
theory, as to whether or not local determinism and objective reality are valid
assumptions.


Violations of Bell’s Inequality 


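As you might have already guessed, quantum mechanics violates this inequality.
For a simple example, let us consider the following scenario, which we will prove
in detail in Chapter 16. Assume that the pair source produces the following state

$$|\psi_-\rangle_{AB} = \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle_{AB} - |\downarrow\uparrow\rangle_{AB} \right) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ +1 \\ -1 \\ 0 \end{pmatrix} \tag{6.479}$$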


Also assume that Alice’s measurements A and A' correspond to measuring along the
axes

$$A = \hat{z} , \quad A' = \cos\phi \, \hat{z} + \sin\phi \, \hat{x} \tag{6.480}$$

while Bob, on the other hand, measures along the axes

$$B = \hat{z} , \quad B' = \cos\phi \, \hat{z} - \sin\phi \, \hat{x} \tag{6.481}$$


Now, all we have to do is compute the averages for all the terms in the inequality. 
We will learn how to do this for spin-1/2 particles and photon polarization in 
Chapters 7 and 9. For now, we just state the results 

$$\frac{1}{N} \sum_{n=1}^{N} A_n B_n = -1 , \quad \frac{1}{N} \sum_{n=1}^{N} A'_n B_n = -\cos\phi \tag{6.482}$$

$$\frac{1}{N} \sum_{n=1}^{N} A_n B'_n = -\cos\phi , \quad \frac{1}{N} \sum_{n=1}^{N} A'_n B'_n = -\cos 2\phi \tag{6.483}$$

If we combine all of these results we find that

$$|\langle g_n \rangle| = \left| \frac{1}{N} \sum_{n=1}^{N} g_n \right| = \left| -1 - 2\cos\phi + \cos 2\phi \right| \tag{6.484}$$

We can now plot the expectation of $g_n(\phi)$ and look for violations of the Bell
inequality.

[Plot of $|\langle g_n \rangle|$ (vertical axis, 0.5 to 2.5) versus $\phi$ (horizontal axis, 0 to 3.5)]

Figure 6.7: Quantum Mechanics versus the Bell Inequality

Clearly, we see violation for a range of values of $\phi$.
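
The stated averages can be cross-checked directly from the state (6.479). The sketch below assumes, ahead of the later chapters, that a measurement along an axis in the x-z plane is represented by the corresponding combination of the Pauli matrices sigma_z and sigma_x with outcomes +1 and -1; the function names are hypothetical.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# |psi_-> = (|up,down> - |down,up>)/sqrt(2), as in (6.479).
psi_minus = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def spin_along(theta):
    """Observable with outcomes +1/-1 along the axis cos(theta) z + sin(theta) x."""
    return np.cos(theta) * sigma_z + np.sin(theta) * sigma_x

def correlation(theta_A, theta_B):
    """Expectation value of the product of Alice's and Bob's outcomes."""
    op = np.kron(spin_along(theta_A), spin_along(theta_B))
    return np.vdot(psi_minus, op @ psi_minus).real

def g_quantum(phi):
    """<g> for the settings A = z, A' at angle +phi, B = z, B' at angle -phi."""
    return (correlation(0, 0) + correlation(phi, 0)
            + correlation(0, -phi) - correlation(phi, -phi))

phis = np.linspace(0, np.pi, 721)
g_vals = np.array([g_quantum(p) for p in phis])
phi_max = phis[np.argmax(np.abs(g_vals))]   # angle of the largest |<g>| on the grid
```

Under these assumptions the computed curve coincides with -1 - 2 cos(phi) + cos(2 phi); its magnitude exceeds 2 for angles between 0 and pi/2 and peaks at 2.5 when phi = pi/3.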

We will have more to say about this interesting subject in Chapter 16. 

Finally, let us summarize once more some of the knowledge we have acquired 
about density operators. 


6.18. Expanding on the Density Operator and the 
Statistical Description of Quantum Systems 

In classical mechanics the state of a system is determined by a point in phase
space. If we do not know the exact positions and momenta of all particles in the
system, we need to use the probability density function to describe the system
statistically.

In quantum mechanics the state of a system is characterized by a state vector
in Hilbert space that contains all the relevant information. To prepare a vector
state for a quantum system at a given time, it suffices to perform a set of
measurements on the system corresponding to a complete set of commuting
observables. However, in practice such measurements are often impossible. The
problem is how to incorporate the incompleteness of our information into the
formalism of quantum mechanics in order to describe the state of such a system.

Incompleteness can be formally described in two ways: 

(1) One way is to characterize the system as a member of a mixed ensemble
in which we do not know the state vector of every member, but we know
only the probability that an arbitrary member can be found in a specified
state vector. For example, 70% of the members are characterized by $|\psi^{(1)}\rangle$
and the remaining 30% by $|\psi^{(2)}\rangle$ (these states are not necessarily orthogonal).

We will investigate this approach later and show that such an ensemble
cannot be characterized by an averaged state vector.

(2) The second way in which a description of the system by a state vector is 
impossible is for systems that interact with others. Such systems are called 
open systems and are of particular interest in quantum optics. In such 
cases we are interested in only a part of the entire system. It is convenient 
to separate the system of primary interest from that of secondary interest 
and to call the former the system and the latter the reservoir. We can 
eliminate the reservoir by using the reduced density operator method as 
we will describe later. 


6.18.1. Statistical Description of Quantum Systems and the 
Nonexistence of an Averaged Quantum State 

Consider a mixed ensemble of similar systems such that our information about
the state of the members is limited to the probability distribution over some
specified state vectors of the system $\{|\alpha\rangle, |\beta\rangle, |\gamma\rangle, \ldots\}$ that are not necessarily
orthogonal. For simplicity we assume that they are the eigenvectors of one of
the observables of the system and form a complete basis for the Hilbert space.

Let A be a Hermitian operator of the system with eigenvectors $\{|m\rangle\}$ for
m = 1, 2, ..., N, where N is the dimension of the Hilbert space of the system. Assume
that our information about the system is limited to the set of probabilities $\{P_m\}$
for finding the system in the m-th eigenvector. $P_m$ satisfies the conditions

$$0 \leq P_m \leq 1 \quad (m = 1, \ldots, N) , \qquad \sum_{m=1}^{N} P_m = 1 \tag{6.485}$$

It is reasonable to ask if the system can be described by a state vector. The
state vector of such a system can be any vector such as

$$|\psi^{(k)}\rangle = \sum_{m=1}^{N} \sqrt{P_m} \, e^{i\phi_m^{(k)}} |m\rangle \tag{6.486}$$

where the phase $\phi_m^{(k)}$ can take on any real number in the interval $(-\pi, \pi)$. State
vectors like Eq. (6.486) are called accessible states of the relevant system. The
only parameters that discriminate one accessible state from another are the
phases $\{\phi_m^{(k)}\}$. Because of their arbitrariness, there is no preference between
these states. So the system can be found in each of its accessible states with
equal probability. Therefore, the probability distribution over the accessible
states Eq. (6.486) is uniform, although the distribution over the complete basis
$\{|m\rangle\}$ is a nonuniform distribution like $P_m$.


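If the number of accessible states in this ensemble is $\Omega$, the probability that an
arbitrary element of the ensemble is in the state $|\psi^{(k)}\rangle$ equals $1/\Omega$ (for every
$k = 1, \ldots, \Omega$). So if we want to define an averaged quantum state for this system,
we should multiply every accessible state by its corresponding probability and
sum over all states:

$$|\bar{\psi}\rangle = \sum_{k=1}^{\Omega} \frac{1}{\Omega} |\psi^{(k)}\rangle = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \sum_{m=1}^{N} \sqrt{P_m} \, e^{i\phi_m^{(k)}} |m\rangle = \sum_{m=1}^{N} \sqrt{P_m} \left( \frac{1}{\Omega} \sum_{k=1}^{\Omega} e^{i\phi_m^{(k)}} \right) |m\rangle \tag{6.487}$$

Because the number of accessible states is very large, we can assume that the
$\phi_m^{(k)}$'s vary continuously. Then we can change the sum to an integration, that
is

$$\frac{1}{\Omega} \sum_{k=1}^{\Omega} e^{i\phi_m^{(k)}} \to \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i\phi_m} \, d\phi_m = 0 \tag{6.488}$$

Hence, the averaged quantum state is zero,

$$|\bar{\psi}\rangle = 0 \tag{6.489}$$

Therefore, such a system cannot be described by a state vector and we should
look for another way to describe it.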
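
The vanishing of the averaged state vector can be illustrated by sampling. This is a sketch under stated assumptions: the probabilities P_m = (0.5, 0.3, 0.2), the number of sampled states, and the random seed are all arbitrary choices made for the example, and the finite sample only approaches the continuous average of (6.488).

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, for reproducibility only
P = np.array([0.5, 0.3, 0.2])           # hypothetical probabilities P_m
n_states = 20000                        # number of sampled accessible states

# Each row is one accessible state: sqrt(P_m) * exp(i * phase_m) with random phases.
phases = rng.uniform(-np.pi, np.pi, size=(n_states, P.size))
states = np.sqrt(P) * np.exp(1j * phases)

# Equal-weight average of the state vectors, as in (6.487).
mean_state = states.mean(axis=0)
mean_norm = np.linalg.norm(mean_state)
```

Every sampled state has unit norm, yet the norm of their average is close to zero and shrinks further as the number of samples grows.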


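We can assign an operator $|\psi^{(k)}\rangle\langle\psi^{(k)}|$ to every accessible state (6.486). Similar
to Eq. (6.487), which defines the ensemble average of the accessible state
vectors, we can define an ensemble average of these corresponding operators by
introducing

$$\rho = \sum_{k=1}^{\Omega} \frac{1}{\Omega} |\psi^{(k)}\rangle\langle\psi^{(k)}| \tag{6.490}$$

We will show that in spite of the zero average of the state vector, the average
$\rho$ does not vanish. If we substitute Eq. (6.486) into Eq. (6.490), we obtain

$$\rho = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \left( \sum_{m=1}^{N} \sqrt{P_m} \, e^{i\phi_m^{(k)}} |m\rangle \right) \left( \sum_{n=1}^{N} \sqrt{P_n} \, e^{-i\phi_n^{(k)}} \langle n| \right) = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \left( \sum_{n,m=1}^{N} \sqrt{P_n P_m} \, e^{i(\phi_m^{(k)} - \phi_n^{(k)})} |m\rangle\langle n| \right) \tag{6.491}$$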


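We separate the double sums in Eq. (6.491) into two separate sums so that the
first one contains the terms with the same m and n, and the second one contains
terms with different m and n and obtain

$$\rho = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \left( \sum_{n=1}^{N} P_n |n\rangle\langle n| \right) + \sum_{k=1}^{\Omega} \frac{1}{\Omega} \left( \sum_{n \neq m} \sqrt{P_m P_n} \, e^{i(\phi_m^{(k)} - \phi_n^{(k)})} |m\rangle\langle n| \right) \tag{6.492}$$

In the first sum all the terms are the same. In the second sum we can exchange
the order of the sum over k with that over m and n and substitute
$\theta_{mn}^{(k)} = \phi_m^{(k)} - \phi_n^{(k)}$:

$$\rho = \sum_{n=1}^{N} P_n |n\rangle\langle n| + \sum_{n \neq m} \sqrt{P_m P_n} \left( \frac{1}{\Omega} \sum_{k=1}^{\Omega} e^{i\theta_{mn}^{(k)}} \right) |m\rangle\langle n| \tag{6.493}$$

The second term in Eq. (6.493) vanishes because the sum over k can be changed
to an integration over $\theta_{mn}$ just like Eq. (6.488), i.e.,

$$\frac{1}{\Omega} \sum_{k=1}^{\Omega} e^{i\theta_{mn}^{(k)}} \to \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i\theta_{mn}} \, d\theta_{mn} = 0$$

Hence Eq. (6.493) reduces to

$$\rho = \sum_{n=1}^{N} P_n |n\rangle\langle n| \tag{6.494}$$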

The operator $\rho$ contains both quantum and classical statistical information
(Q). Hence, all the necessary information for a quantum statistical description
of the system is contained in $\rho$. This idea can be clarified by calculating the
ensemble average of an observable O of the system. For this purpose we multiply
its expectation value in every accessible state $|\psi^{(k)}\rangle$ by its corresponding
probability, that is, $1/\Omega$, and sum over all states:

$$\langle O \rangle = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \langle O \rangle_k = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \langle\psi^{(k)}| O |\psi^{(k)}\rangle \tag{6.495}$$

If we substitute the state vector Eq. (6.486) into Eq. (6.495) and follow the
same procedure that was done after Eq. (6.491), we find that

$$\langle O \rangle = \sum_{k=1}^{\Omega} \frac{1}{\Omega} \left( \sum_{n=1}^{N} \sqrt{P_n} \, e^{-i\phi_n^{(k)}} \langle n| \right) O \left( \sum_{m=1}^{N} \sqrt{P_m} \, e^{i\phi_m^{(k)}} |m\rangle \right) = \sum_{n=1}^{N} P_n \langle n| O |n\rangle = \sum_{n=1}^{N} P_n O_{nn} \tag{6.496}$$

Therefore the ensemble average of an arbitrary operator can be evaluated by 
knowing the probability distribution {P n } and the complete basis {|n}}, without 
knowing the state vector. 

If we use the completeness of the basis {|n}}, we can write Eq. (6.495) as 

N N 

(O) = £ P n (n\ O \n) = £ P n (n\0\m) (m\n) 

n =1 m,n=l 

N N / N \ 

= £ Pn{m\n) (n\ O \m) = £ {m\ I £ P n \n) {n\ I O \m) 

m,n=l m= 1 \n=l / 

N 

= £ ( m\pO\m) (6.497) 

m= 1 

The right-hand side of Eq. (6.497) is the trace of the operator $\rho O$ and can be
rewritten as

$$\langle O\rangle = \mathrm{Tr}(\rho O) \tag{6.498}$$

The density operator, Eq. (6.494), is sufficient to calculate every ensemble
average of the operators. Therefore the role of the density operator in the
statistical description of quantum systems is similar to the role of the probability
distribution function in the statistical description of classical systems.
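
The trace formula can be checked numerically. The following is a minimal sketch, not part of the original derivation: the three-state probabilities, the Hermitian matrix `obs`, and all function names are hypothetical choices made for illustration. It builds ensemble members of the form $\sum_n \sqrt{P_n}\,e^{i\theta_n}|n\rangle$ with uniformly random phases, averages the expectation value over the ensemble, and compares the result with $\mathrm{Tr}(\rho O) = \sum_n P_n O_{nn}$.

```python
import cmath
import random


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def ensemble_average(probs, obs, members, rng):
    """Average <psi|O|psi> over `members` states with random phases."""
    n = len(probs)
    total = 0.0
    for _ in range(members):
        phases = [rng.uniform(0.0, 2.0 * cmath.pi) for _ in range(n)]
        psi = [cmath.sqrt(p) * cmath.exp(1j * t) for p, t in zip(probs, phases)]
        total += sum(psi[i].conjugate() * obs[i][j] * psi[j]
                     for i in range(n) for j in range(n)).real
    return total / members


# Assumed example data: probabilities P_n and a Hermitian observable O.
probs = [0.5, 0.3, 0.2]
obs = [[1.0, 0.5, 0.0],
       [0.5, -1.0, 0.25j],
       [0.0, -0.25j, 2.0]]

# Diagonal density matrix rho = sum_n P_n |n><n|.
rho = [[probs[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

trace_value = trace(matmul(rho, obs)).real
sampled = ensemble_average(probs, obs, 20000, random.Random(0))
print(trace_value, sampled)
```

The sampled average agrees with the trace only up to statistical fluctuations, because the off-diagonal (phase-dependent) terms cancel on average rather than member by member.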

6.18.2. Open Quantum Systems and the Reduced Density 
Operator 

Most physical systems of interest are not isolated but interact with other
systems. For example, an atom cannot be studied as an isolated system even if it is
in a vacuum because it interacts with the vacuum state of the surrounding
electromagnetic field. This interaction is the source of many interesting phenomena
such as spontaneous emission, the natural line width, the Lamb shift, and quantum
noise. To study such a system we eliminate the degrees of freedom of the
environment. This elimination leads to incompleteness of information about the
system of interest so that the description of the system by a state vector is no
longer possible.

Suppose we are interested only in making measurements on a system S that
interacts with its environment R. The Hilbert space of the composite system
S + R is the tensor product

$$\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_R \tag{6.499}$$

where $\mathcal{H}_S$ and $\mathcal{H}_R$ are the Hilbert spaces of the system and
reservoir, respectively.

Let $H_S$ and $H_R$ be the corresponding Hamiltonians, and $\{|s\rangle\}$ and
$\{|r\rangle\}$ the energy eigenvectors of the system and its reservoir,
respectively, in the absence of an interaction. Because the two systems are
independent, their operators commute with each other:

$$[H_S, H_R] = 0 \tag{6.500}$$

When they are put into contact, the total Hamiltonian is

$$H = H_S + H_R + V_{RS} \tag{6.501}$$

where $V_{RS}$ is the interaction between the two systems. If we prepare the
composite system at the instant $t_0$ in a product state

$$|\Psi(t_0)\rangle = |\psi_S\rangle \otimes |\psi_R\rangle \tag{6.502}$$

then any measurement on S at this instant depends only on the state
$|\psi_S\rangle$ and is independent of the state of the reservoir.

As time evolves, the state of the composite system evolves according to the
Schrödinger equation

$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = H|\Psi(t)\rangle \tag{6.503}$$

where $H$ is the total Hamiltonian as in Eq. (6.501). In general $|\Psi(t)\rangle$
cannot be written in a product form such as Eq. (6.502) because the interaction
energy $V_{RS}$ depends on both systems. Such a state is called an entangled state.

The question is how we can describe the state of S and make measurements
on it. To answer this question, let us consider an observable of the system S
such as $A_S$ and calculate its expectation value when the state of the composite
system is $|\Psi(t)\rangle$. For this purpose we can use Eq. (6.498):

$$\langle A_S\rangle = \mathrm{Tr}_{SR}(\rho\tilde{A}_S) \tag{6.504}$$

where the trace is over the complete basis $\{|s,r\rangle\}$ of the space
$\mathcal{H}$, and $\rho$ is the density operator of the composite system and is
given by

$$\rho(t) = |\Psi(t)\rangle\langle\Psi(t)| \tag{6.505}$$

$\tilde{A}_S$ is the operator $A_S$ in the composite space $\mathcal{H}$, that is,

$$\tilde{A}_S = A_S \otimes I_R \tag{6.506}$$

where $I_R$ is the identity operator in the space $\mathcal{H}_R$.


A straightforward calculation then gives

$$\langle A_S\rangle = \mathrm{Tr}_S(\rho_S A_S) \tag{6.507}$$

where $\rho_S$ is the reduced density operator of the system S in the space
$\mathcal{H}_S$ and is defined as

$$\rho_S = \mathrm{Tr}_R(\rho) \tag{6.508}$$

As is seen from Eq. (6.507), we can calculate the expectation values of all
observables of the system S in its own state space $\mathcal{H}_S$ by using its
reduced density operator.

A necessary and sufficient condition for a system to be in a pure state is that the
trace of the square of its density operator equals unity. But in general
$\mathrm{Tr}(\rho_S^2)$ is not necessarily equal to unity even if the composite
system is in a pure state. Therefore the state of an open system cannot generally
be described by a state vector; instead it is described completely by the reduced
density operator.
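
As a concrete illustration of the partial trace and the purity criterion, here is a minimal Python sketch. It is not taken from the text: the function names are hypothetical, and the choice of a two-level system S coupled to a two-level reservoir R is an assumption made to keep the example small. The composite pure state is stored as amplitudes `amps[s][r]`, so that tracing out R gives $(\rho_S)_{ss'} = \sum_r c_{sr} c^*_{s'r}$.

```python
import math


def reduced_density_matrix(amps):
    """Trace out R from |Psi> = sum_{s,r} amps[s][r] |s>|r>."""
    dim_s = len(amps)
    dim_r = len(amps[0])
    return [[sum(amps[s][r] * amps[t][r].conjugate() for r in range(dim_r))
             for t in range(dim_s)] for s in range(dim_s)]


def trace_of_product(a, b):
    """Real part of Tr(a b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n)).real


def purity(rho):
    """Tr(rho^2); equals 1 only for a pure state."""
    return trace_of_product(rho, rho)


inv_sqrt2 = 1.0 / math.sqrt(2.0)

# Product state |0>|0>: S alone is still described by a state vector.
rho_product = reduced_density_matrix([[1.0, 0.0], [0.0, 0.0]])

# Entangled state (|0>|0> + |1>|1>)/sqrt(2): S alone is in a mixed state.
rho_entangled = reduced_density_matrix([[inv_sqrt2, 0.0], [0.0, inv_sqrt2]])

# An observable of S alone, evaluated with the reduced density operator.
sigma_z = [[1.0, 0.0], [0.0, -1.0]]
mean_sigma_z = trace_of_product(rho_entangled, sigma_z)

print(purity(rho_product), purity(rho_entangled), mean_sigma_z)
```

In this sketch the composite system is in a pure state in both cases, yet only the product state gives a reduced density operator with unit purity.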

The most complete information about a quantum system is contained in its 
state vector, but usually our information about the system is not sufficiently 
complete to determine the state vector of the system. The incompleteness of 
our information can be incorporated into the formalism of quantum mechanics 
in two ways. One way is to describe the system as a member of a mixed ensemble
in which our information is limited to a probability distribution over some
specified state vectors of the system. We showed that the ensemble average 
of the accessible states of the system is zero and the only way to describe the 
system is by the density operator. 

In the second case, we wish to make measurements on a part of a larger system. 
This part is considered to be an open system that interacts with the rest of 
the larger system (the environment or reservoir), which is not of interest to us. 
Even if we can determine the state vectors of the system and the reservoir at 
some instant of time, the interaction potential causes the composite system to 
evolve into another state in which the degrees of freedom of both systems are 
entangled so that we cannot separate the state of the system from that of the 
reservoir. In order to focus on the system of interest we eliminate the degrees 
of freedom of the reservoir. In this way we lose part of the information about 
the system that had been coded in the state of the composite system so that 
the description of the system by a state vector is no longer possible, and the 
system is described by the reduced density operator. 
6.19. Problems

6.19.1. Can It Be Written? 

Show that a density matrix $\rho$ represents a state vector (i.e., it can be written
as $|\psi\rangle\langle\psi|$ for some vector $|\psi\rangle$) if, and only if,

$$\rho^2 = \rho$$

6.19.2. Pure and Nonpure States 

Consider an observable $\sigma$ that can only take on two values +1 or -1. The
eigenvectors of the corresponding operator are denoted by $|+\rangle$ and
$|-\rangle$. Now consider the following states.

(a) The one-parameter family of pure states that are represented by the vectors

$$|\theta\rangle = \frac{1}{\sqrt{2}}|+\rangle + \frac{e^{i\theta}}{\sqrt{2}}|-\rangle$$

for arbitrary $\theta$.

(b) The nonpure state

$$\rho = \frac{1}{2}|+\rangle\langle +| + \frac{1}{2}|-\rangle\langle -|$$

Show that $\langle\sigma\rangle = 0$ for both of these states. What, if any, are the
physical differences between these various states, and how could they be measured?


6.19.3. Probabilities 


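Suppose the operator

$$M = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$

represents an observable. Calculate the probability $\mathrm{Prob}(M = 0|\rho)$ for
the following state operators:

$$\text{(a) } \rho = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & 0 & \frac{1}{4} \end{pmatrix}, \quad \text{(b) } \rho = \begin{pmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \end{pmatrix}, \quad \text{(c) } \rho = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2} \end{pmatrix}$$
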

6.19.4. Acceptable Density Operators 


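Which of the following are acceptable as state operators? Find the corresponding
state vectors for any of them that represent pure states.

$$\rho_1 = \begin{pmatrix} \frac{1}{4} & \frac{3}{4} \\ \frac{3}{4} & \frac{3}{4} \end{pmatrix}, \quad \rho_2 = \begin{pmatrix} \frac{9}{25} & \frac{12}{25} \\ \frac{12}{25} & \frac{16}{25} \end{pmatrix}$$

$$\rho_3 = \begin{pmatrix} \frac{1}{2} & 0 & \frac{1}{4} \\ 0 & \frac{1}{2} & 0 \\ \frac{1}{4} & 0 & 0 \end{pmatrix}, \quad \rho_4 = \begin{pmatrix} \frac{1}{2} & 0 & \frac{1}{4} \\ 0 & \frac{1}{4} & 0 \\ \frac{1}{4} & 0 & \frac{1}{4} \end{pmatrix}$$

$$\rho_5 = \frac{1}{3}|u\rangle\langle u| + \frac{2}{3}|v\rangle\langle v| + \frac{\sqrt{2}}{3}|u\rangle\langle v| + \frac{\sqrt{2}}{3}|v\rangle\langle u|$$

$$\langle u|u\rangle = \langle v|v\rangle = 1 \text{ and } \langle u|v\rangle = 0$$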


6.19.5. Is it a Density Matrix? 

Let $\rho_1$ and $\rho_2$ be a pair of density matrices. Show that

$$\rho = r\rho_1 + (1 - r)\rho_2$$

is a density matrix for all real numbers r such that 0 < r < 1.


6.19.6. Unitary Operators 

An important class of operators are unitary, defined as those that preserve
inner products, i.e., if $|\phi'\rangle = U|\phi\rangle$ and $|\psi'\rangle = U|\psi\rangle$,
then $\langle\phi'|\psi'\rangle = \langle\phi|\psi\rangle$ and
$\langle\psi'|\phi'\rangle = \langle\psi|\phi\rangle$.

(a) Show that unitary operators satisfy $UU^\dagger = U^\dagger U = I$, i.e., the
adjoint is the inverse.

(b) Consider $U = e^{iA}$, where A is a Hermitian operator. Show that
$U^\dagger = e^{-iA}$ and thus show that U is unitary.

(c) Let $U(t) = e^{-iHt/\hbar}$ where t is time and H is the Hamiltonian. Let
$|\psi(0)\rangle$ be the state at time t = 0. Show that
$|\psi(t)\rangle = U(t)|\psi(0)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle$
is a solution of the time-dependent Schrödinger equation, i.e., the state
evolves according to a unitary map. Explain why this is required by the
conservation of probability in non-relativistic quantum mechanics.

(d) Let $\{|u_n\rangle\}$ be a complete set of energy eigenfunctions,
$H|u_n\rangle = E_n|u_n\rangle$. Show that
$U(t) = \sum_n e^{-iE_n t/\hbar}|u_n\rangle\langle u_n|$. Using this result, show
that $|\psi(t)\rangle = \sum_n c_n e^{-iE_n t/\hbar}|u_n\rangle$. What is $c_n$?


6.19.7. More Density Matrices 


Suppose we have a system with total angular momentum 1. Pick a basis
corresponding to the three eigenvectors of the z-component of the angular
momentum, $J_z$, with eigenvalues +1, 0, -1, respectively. We are given an ensemble
of such systems described by the density matrix

$$\rho = \frac{1}{4}\begin{pmatrix} 2 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$$

(a) Is $\rho$ a permissible density matrix? Give your reasoning. For the remainder
of this problem, assume that it is permissible. Does it describe a pure or
mixed state? Give your reasoning.

(b) Given the ensemble described by $\rho$, what is the average value of $J_z$?

(c) What is the spread (standard deviation) in the measured values of $J_z$?

6.19.8. Scale Transformation 

Space is invariant under the scale transformation

$$x \to x' = e^{c}x$$

where c is a parameter. The corresponding unitary operator may be written as

$$U = e^{-icD}$$

where D is the dilation generator. Determine the commutators $[D, x]$ and
$[D, p_x]$ between the generators of dilation and space displacements. Determine
the operator D. Not all the laws of physics are invariant under dilation, so the
symmetry is less common than displacements or rotations. You will need to use
the identity in Problem 6.19.11.

6.19.9. Operator Properties 

(a) Prove that if H is a Hermitian operator, then $U = e^{iH}$ is a unitary operator.

(b) Show that $\det U = e^{i\,\mathrm{Tr}H}$.

6.19.10. An Instantaneous Boost 

The unitary operator

$$U(\vec{v}) = e^{i\vec{v}\cdot\vec{G}}$$

describes the instantaneous (t = 0) effect of a transformation to a frame of
reference moving at the velocity $\vec{v}$ with respect to the original reference
frame. Its effects on the velocity and position operators are:

$$U\vec{V}U^{-1} = \vec{V} - \vec{v}I , \quad U\vec{Q}U^{-1} = \vec{Q}$$

Find an operator $\vec{G}_t$ such that the unitary operator
$U(\vec{v},t) = e^{i\vec{v}\cdot\vec{G}_t}$ will yield the full Galilean
transformation

$$U\vec{V}U^{-1} = \vec{V} - \vec{v}I , \quad U\vec{Q}U^{-1} = \vec{Q} - \vec{v}tI$$

Verify that $\vec{G}_t$ satisfies the same commutation relation with $\vec{P}$,
$\vec{J}$ and H as does $\vec{G}$.
6.19.11. A Very Useful Identity


Prove the following identity, in which A and B are operators and x is a
parameter.

$$e^{xA}Be^{-xA} = B + [A, B]\,x + [A, [A, B]]\,\frac{x^2}{2!} + [A, [A, [A, B]]]\,\frac{x^3}{3!} + \ldots$$

There is a clever way (see Problem 6.19.12 below if you are having difficulty) to do
this problem using ODEs and not just brute-force multiplying everything out.


6.19.12. A Very Useful Identity with some help.... 

The operator $U(a) = e^{ipa/\hbar}$ is a translation operator in space (here we
consider only one dimension). To see this we need to prove the identity

$$e^{A}Be^{-A} = \sum_{n=0}^{\infty}\frac{1}{n!}\underbrace{[A, [A, \ldots [A}_{n}, B\underbrace{] \ldots ]]}_{n} = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \frac{1}{3!}[A, [A, [A, B]]] + \ldots$$

(a) Consider $B(t) = e^{tA}Be^{-tA}$, where t is a real parameter. Show that

$$\frac{d}{dt}B(t) = e^{tA}[A, B]e^{-tA}$$

(b) Obviously, $B(0) = B$ and therefore

$$B(1) = B + \int_0^1 dt\,\frac{d}{dt}B(t)$$

Now using the power series $B(t) = \sum_{n=0}^{\infty} t^n B_n$ and using the above
integral expression, show that $B_n = [A, B_{n-1}]/n$.

(c) Show by induction that

$$B_n = \frac{1}{n!}\underbrace{[A, [A, \ldots [A}_{n}, B\underbrace{] \ldots ]]}_{n}$$

(d) Use $B(1) = e^{A}Be^{-A}$ and prove the identity.

(e) Now prove $e^{ipa/\hbar}\,x\,e^{-ipa/\hbar} = x + a$ showing that U(a) indeed
translates space.
6.19.13. Another Very Useful Identity

Prove that

$$e^{A+B} = e^{A}e^{B}e^{-\frac{1}{2}[A,B]}$$

provided that the operators A and B satisfy

$$[A, [A, B]] = [B, [A, B]] = 0$$

A clever solution uses the result of Problem 6.19.11 or 6.19.12 and ODEs.

6.19.14. Pure to Nonpure? 

Use the equation of motion for the density operator p to show that a pure state 
cannot evolve into a nonpure state and vice versa. 


6.19.15. Schur’s Lemma 

Let $G$ be the space of complex differentiable test functions, $g(x)$, where x is
real. It is convenient to extend $G$ slightly to encompass all functions,
$\tilde{g}(x)$, such that $\tilde{g}(x) = g(x) + c$, where $g \in G$ and c is any
constant. Let us call the extended space $\tilde{G}$. Let q and p be linear
operators on $\tilde{G}$ such that

$$q\,g(x) = x\,g(x)$$

$$p\,g(x) = -i\frac{dg(x)}{dx} = -ig'(x)$$

Suppose M is a linear operator on $\tilde{G}$ that commutes with q and p. Show that

(1) q and p are hermitian on $\tilde{G}$

(2) M is a constant multiple of the identity operator


6.19.16. More About the Density Operator 

Let us try to improve our understanding of the density matrix formalism and the 
connections with information or entropy. We consider a simple two-state system. 
Let p be any general density matrix operating on the two-dimensional Hilbert 
space of this system. 

(a) Calculate the entropy, $s = -\mathrm{Tr}(\rho \ln \rho)$, corresponding to this
density matrix. Express the result in terms of a single real parameter. Make a
clear interpretation of this parameter and specify its range.

(b) Make a graph of the entropy as a function of the parameter. What is
the entropy for a pure state? Interpret your graph in terms of knowledge
about a system taken from an ensemble with density matrix $\rho$.

(c) Consider a system with ensemble $\rho$ a mixture of two ensembles $\rho_1$ and
$\rho_2$:

$$\rho = \theta\rho_1 + (1 - \theta)\rho_2 , \quad 0 \le \theta \le 1$$


As an example, suppose 



in some basis. Prove that 

$$s(\rho) \ge \theta s(\rho_1) + (1 - \theta)s(\rho_2)$$

with equality if $\theta = 0$ or $\theta = 1$. This is the so-called von Neumann
mixing theorem.


6.19.17. Entanglement and the Purity of a Reduced Density 
Operator 

Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be a pair of two-dimensional Hilbert spaces
with given orthonormal bases $\{|0_A\rangle, |1_A\rangle\}$ and
$\{|0_B\rangle, |1_B\rangle\}$. Let $|\Psi_{AB}\rangle$ be the state

$$|\Psi_{AB}\rangle = \cos\theta\,|0_A\rangle \otimes |0_B\rangle + \sin\theta\,|1_A\rangle \otimes |1_B\rangle$$

For $0 < \theta < \pi/2$, this is an entangled state. The purity $\zeta$ of the
reduced density operator $\rho_A = \mathrm{Tr}_B[|\Psi_{AB}\rangle\langle\Psi_{AB}|]$
given by

$$\zeta = \mathrm{Tr}[\rho_A^2]$$

is a good measure of the entanglement of states in $\mathcal{H}_{AB}$. For pure
states of the above form, find extrema of $\zeta$ with respect to $\theta$
($0 \le \theta \le \pi/2$). Do entangled states have large $\zeta$ or small $\zeta$?

6.19.18. The Controlled-Not Operator 

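Again let $\mathcal{H}_A$ and $\mathcal{H}_B$ be a pair of two-dimensional Hilbert
spaces with given orthonormal bases $\{|0_A\rangle, |1_A\rangle\}$ and
$\{|0_B\rangle, |1_B\rangle\}$. Consider the controlled-not operator on
$\mathcal{H}_{AB}$ (very important in quantum computing),

$$U_{AB} = P_0^A \otimes I^B + P_1^A \otimes \sigma_x^B$$

where $P_0^A = |0_A\rangle\langle 0_A|$, $P_1^A = |1_A\rangle\langle 1_A|$ and
$\sigma_x^B = |0_B\rangle\langle 1_B| + |1_B\rangle\langle 0_B|$.

Write a matrix representation for $U_{AB}$ with respect to the following (ordered)
basis for $\mathcal{H}_{AB}$

$$|0_A\rangle \otimes |0_B\rangle , \quad |0_A\rangle \otimes |1_B\rangle , \quad |1_A\rangle \otimes |0_B\rangle , \quad |1_A\rangle \otimes |1_B\rangle$$

Find the eigenvectors of $U_{AB}$ - you should be able to do this by inspection. Do
any of them correspond to entangled states?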
6.19.19. Creating Entanglement via Unitary Evolution

Working with the same system as in Problems 6.19.17 and 6.19.18, find a
factorizable input state

$$|\Psi_{AB}^{in}\rangle = |\psi_A\rangle \otimes |\psi_B\rangle$$

such that the output state

$$|\Psi_{AB}^{out}\rangle = U_{AB}|\Psi_{AB}^{in}\rangle$$

is maximally entangled. That is, find any factorizable $|\Psi_{AB}^{in}\rangle$ such
that $\mathrm{Tr}[\rho_A^2] = 1/2$, where

$$\rho_A = \mathrm{Tr}_B[|\Psi_{AB}^{out}\rangle\langle\Psi_{AB}^{out}|]$$


6.19.20. Tensor-Product Bases 


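Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be a pair of two-dimensional Hilbert spaces
with given orthonormal bases $\{|0_A\rangle, |1_A\rangle\}$ and
$\{|0_B\rangle, |1_B\rangle\}$. Consider the following entangled state in the joint
Hilbert space $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$,

$$|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|0_A 1_B\rangle + |1_A 0_B\rangle\right)$$

where $|0_A 1_B\rangle$ is short-hand notation for $|0_A\rangle \otimes |1_B\rangle$
and so on. Rewrite this state in terms of a new basis
$\{|\tilde{0}_A\tilde{0}_B\rangle, |\tilde{0}_A\tilde{1}_B\rangle, |\tilde{1}_A\tilde{0}_B\rangle, |\tilde{1}_A\tilde{1}_B\rangle\}$,
where

$$|\tilde{0}_A\rangle = \cos\frac{\phi}{2}\,|0_A\rangle + \sin\frac{\phi}{2}\,|1_A\rangle$$

$$|\tilde{1}_A\rangle = -\sin\frac{\phi}{2}\,|0_A\rangle + \cos\frac{\phi}{2}\,|1_A\rangle$$

and similarly for $\{|\tilde{0}_B\rangle, |\tilde{1}_B\rangle\}$. Again
$|\tilde{0}_A\tilde{0}_B\rangle = |\tilde{0}_A\rangle \otimes |\tilde{0}_B\rangle$,
etc. Is our particular choice of $|\Psi_{AB}\rangle$ special in some way?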


6.19.21. Matrix Representations 

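Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be a pair of two-dimensional Hilbert spaces
with given orthonormal bases $\{|0_A\rangle, |1_A\rangle\}$ and
$\{|0_B\rangle, |1_B\rangle\}$. Let $|0_A 0_B\rangle = |0_A\rangle \otimes |0_B\rangle$,
etc. Let the natural tensor product basis kets for the joint space
$\mathcal{H}_{AB}$ be represented by column vectors as follows:

$$|0_A 0_B\rangle \leftrightarrow \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad |0_A 1_B\rangle \leftrightarrow \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad |1_A 0_B\rangle \leftrightarrow \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad |1_A 1_B\rangle \leftrightarrow \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$
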


For parts (a) -(c), let 

Pab = - |0 A ) (0 A | ®> - (| 0 b) + | 1 b» (( 0 b| + ( 1 -bI) 

+ | |1 A > (1 A | «»| (|0b> - |1^» (<0b| - <1b|) 
(a) Find the matrix representation of $\rho_{AB}$ that corresponds to the above
vector representation of the basis kets.

(b) Find the matrix representation of the partial projectors $I^A \otimes P_0^B$ and
$I^A \otimes P_1^B$ (see problem 6.19.18 for definitions) and then use them to
compute the matrix representation of

$$(I^A \otimes P_0^B)\,\rho_{AB}\,(I^A \otimes P_0^B) + (I^A \otimes P_1^B)\,\rho_{AB}\,(I^A \otimes P_1^B)$$

(c) Find the matrix representation of $\rho_A = \mathrm{Tr}_B[\rho_{AB}]$ by taking
the partial trace using Dirac language methods.

6.19.22. Practice with Dirac Language for Joint Systems 

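Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be a pair of two-dimensional Hilbert spaces
with given orthonormal bases $\{|0_A\rangle, |1_A\rangle\}$ and
$\{|0_B\rangle, |1_B\rangle\}$. Let $|0_A 0_B\rangle = |0_A\rangle \otimes |0_B\rangle$,
etc. Consider the joint state

$$|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|0_A 0_B\rangle + |1_A 1_B\rangle\right)$$

(a) For this particular joint state, find the most general form of an observable
$O^A$ acting only on the A subsystem such that

$$\langle\Psi_{AB}|\,O^A \otimes I^B\,|\Psi_{AB}\rangle = \langle\Psi_{AB}|\,(I^A \otimes P_0^B)\,O^A \otimes I^B\,(I^A \otimes P_0^B)\,|\Psi_{AB}\rangle$$

where

$$P_0^B = |0_B\rangle\langle 0_B|$$

Express your answer in Dirac language.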

(b) Consider the specific operator

$$X^A = |0_A\rangle\langle 1_A| + |1_A\rangle\langle 0_A|$$

which satisfies the general form you should have found in part (a). Find
the most general form of the joint state vector $|\Psi'_{AB}\rangle$ such that

$$\langle\Psi'_{AB}|\,X^A \otimes I^B\,|\Psi'_{AB}\rangle \neq \langle\Psi'_{AB}|\,(I^A \otimes P_0^B)\,X^A \otimes I^B\,(I^A \otimes P_0^B)\,|\Psi'_{AB}\rangle$$

(c) Find an example of a reduced density matrix $\rho^A$ for the A subsystem such
that no joint state vector $|\Psi'_{AB}\rangle$ of the general form you found in
part (b) can satisfy

$$\rho^A = \mathrm{Tr}_B[|\Psi'_{AB}\rangle\langle\Psi'_{AB}|]$$
6.19.23. More Mixed States


Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be a pair of two-dimensional Hilbert spaces
with given orthonormal bases $\{|0_A\rangle, |1_A\rangle\}$ and
$\{|0_B\rangle, |1_B\rangle\}$. Let $|0_A 0_B\rangle = |0_A\rangle \otimes |0_B\rangle$,
etc. Suppose that both the A and B subsystems are initially under your control and
you prepare the initial joint state

$$|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|0_A 0_B\rangle + |1_A 1_B\rangle\right)$$

(a) Suppose you take the A and B systems prepared in the state $|\Psi_{AB}\rangle$
and give them to your friend, who then performs the following procedure. Your
friend flips a biased coin with probability p for heads; if the result of the
coin-flip is a head (probability p) the result of the procedure performed by
your friend is the state

$$\frac{1}{\sqrt{2}}\left(|0_A 0_B\rangle - |1_A 1_B\rangle\right)$$

and if the result is a tail (probability 1 - p) the result of the procedure
performed by your friend is the state

$$\frac{1}{\sqrt{2}}\left(|0_A 0_B\rangle + |1_A 1_B\rangle\right)$$

i.e., nothing happened. After this procedure what is the density operator
you should use to represent your knowledge of the joint state?

(b) Suppose you take the A and B systems prepared in the state $|\Psi_{AB}\rangle$
and give them to your friend, who then performs the alternate procedure. Your
friend performs a measurement of the observable

o = i A ®u h 

but does not tell you the result. After this procedure, what density opera¬ 
tor should you use to represent your knowledge of the joint state? Assume 
that you can use the projection postulate (reduction) for state condition¬ 
ing (preparation). 

6.19.24. Complete Sets of Commuting Observables 

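Consider a three-dimensional Hilbert space $\mathcal{H}_3$ and the following set of
operators:

$$O_\alpha \leftrightarrow \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} , \quad O_\beta \leftrightarrow \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} , \quad O_\gamma \leftrightarrow \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Find all possible complete sets of commuting observables (CSCO). That is,
determine whether or not each of the sets

$$\{O_\alpha\}, \{O_\beta\}, \{O_\gamma\}, \{O_\alpha, O_\beta\}, \{O_\alpha, O_\gamma\}, \{O_\beta, O_\gamma\}, \{O_\alpha, O_\beta, O_\gamma\}$$

constitutes a valid CSCO.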
6.19.25. Conserved Quantum Numbers

Determine which of the CSCO's in problem 6.19.24 (if any) are conserved by the
Schrödinger equation with Hamiltonian

$$H = \varepsilon_0\begin{pmatrix} 2 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \varepsilon_0\left(O_\alpha + O_\beta\right)$$
Chapter 7


How Does It really Work: 

Photons, K-Mesons and Stern-Gerlach 


7.1. Introduction 

Many experiments indicate that electromagnetic waves have a vector property
called polarization. Suppose that we have an electromagnetic wave (we will just
say light from now on) passing through a piece of Polaroid material. The
Polaroid material has the property that it lets through only that light whose
polarization vector is oriented parallel to a preferred direction in the Polaroid
(called the optic axis).

Classically, if an incident beam of light is polarized parallel to the optic axis, 
then experiment says that all of its energy gets through the Polaroid. 

If, on the other hand, it is polarized perpendicular to the optic axis, then
experiment says that none of its energy gets through the Polaroid.

Classically, if it is polarized at an angle $\alpha$ to the optic axis, then
experiment says that a fraction $\cos^2\alpha$ of its energy gets through the
Polaroid.

Many experiments, including several that we will discuss later, indicate that
light (and everything else in the universe as it turns out) exhibits both
"particle-like" and "wave-like" properties. (We will define both of these terms
more carefully later and decide then if this simple statement makes any sense).

The particle associated with light is called a photon. 

Can the experimental polarization/Polaroid results be reconciled with this idea 
of a particle-like photon? 

Suppose we assume that a light wave that is polarized in a certain direction is
made up of a large number of photons each of which is polarized in that same
direction.

The particle properties of light, as represented by photons, invariably lead to
some confusion. It is not possible to eliminate all of this confusion at this
elementary discussion level because a satisfactory treatment of photons requires
quantum electrodynamics.

We can, however, make many of the more important physical properties clear. 

Consider a simple representation of a monochromatic electromagnetic wave (and
its associated photons) with angular frequency $\omega$ and wavelength $\lambda$
moving in a direction given by the unit vector $\hat{k}$. Such a monochromatic
electromagnetic wave is composed of N (a very large number) photons, each with
energy E and momentum $\vec{p}$, such that we have the relationships

$$E = \hbar\omega , \quad \vec{p} = \hbar\vec{k} = \frac{h}{\lambda}\hat{k} \tag{7.1}$$

where $\vec{k}$ is the wave vector, $\hbar = h/2\pi$, $\omega = 2\pi f$, h = Planck's
constant, f = frequency = $c/\lambda$, and c = speed of light. We note that

$$E = \hbar\omega = \frac{h}{2\pi}\omega = hf = \frac{hc}{\lambda} = pc \tag{7.2}$$

as required by relativity for a particle with zero mass (such as the photon).

The number of photons in the wave is such that the total energy of the N
photons, $NE = N\hbar\omega$, is equal to the total energy W in the electromagnetic
wave, i.e., $W = NE = N\hbar\omega$.
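
To get a feeling for the numbers, the relations $E = hc/\lambda$, $p = h/\lambda$ and $W = NE$ can be evaluated directly. The sketch below is an illustration only: the 500 nm wavelength and the 1 mJ pulse energy are arbitrary assumed values, and the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck's constant in J s (exact SI value)
C_LIGHT = 299792458.0       # speed of light in m/s (exact SI value)
HBAR = H_PLANCK / (2.0 * math.pi)


def photon_energy(wavelength_m):
    """Energy E = h c / lambda of one photon, in joules."""
    return H_PLANCK * C_LIGHT / wavelength_m


def photon_momentum(wavelength_m):
    """Magnitude p = h / lambda of the photon momentum, in kg m/s."""
    return H_PLANCK / wavelength_m


def photon_count(total_energy_j, wavelength_m):
    """Number of photons N = W / E carried by a wave of total energy W."""
    return total_energy_j / photon_energy(wavelength_m)


wavelength = 500e-9                      # assumed: green light, 500 nm
energy = photon_energy(wavelength)       # about 4e-19 J per photon
momentum = photon_momentum(wavelength)
omega = 2.0 * math.pi * C_LIGHT / wavelength
n_photons = photon_count(1e-3, wavelength)   # assumed: 1 mJ of light

print(energy, momentum, n_photons)
```

Even a millijoule of visible light contains of order $10^{15}$ photons, which is why the graininess of the energy is not noticed in everyday optics.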

Here, we are using the fact, derived from many experiments, that the energy 
of the light wave is quantized and thus can only take on certain discrete values 
(its value is a multiple of some quantum of energy). 

When we specify the polarization of light, we are actually giving the direction
of the electric field vector $\vec{E}$. Light waves are generally represented by
plane electromagnetic waves. This means that the electric field vector $\vec{E}$
and the magnetic field vector $\vec{B}$ are both perpendicular to the direction of
propagation specified by $\hat{k}$. If we choose the direction of propagation to be
the z-axis, which is specified by the unit vector $\hat{e}_z$, then $\vec{E}$ and
$\vec{B}$ lie in the x - y plane. Since we are only considering the polarization
property at this point, we can concentrate on the $\vec{E}$ vector alone. Now, any
vector in the x - y plane can be specified in terms of a pair of orthonormal
vectors (called the basis) in that plane. The vectors in these orthonormal
directions are called the polarization vectors.

Two standard sets of orthonormal polarization vectors are often chosen when
one discusses polarization. One of the two sets is

$$\hat{e}_x = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} , \quad \hat{e}_y = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \tag{7.3}$$

which correspond to plane (or linearly)-polarized waves. A second set is

$$\hat{e}_R = \hat{e}_+ = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} , \quad \hat{e}_L = \hat{e}_- = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \\ 0 \end{pmatrix} \tag{7.4}$$

which correspond to circularly polarized waves.


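For classical electromagnetic fields, a light wave propagating in the z-direction
is usually described (illustrating with the two orthonormal sets) by electric field
vectors of the forms given below.

Plane-polarized basis:

$$\vec{E}(\vec{r},t) = \begin{pmatrix} E_x(\vec{r},t) \\ E_y(\vec{r},t) \\ 0 \end{pmatrix} = E_x(\vec{r},t)\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + E_y(\vec{r},t)\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \tag{7.5}$$

$$\vec{E}(\vec{r},t) = E_x(\vec{r},t)\,\hat{e}_x + E_y(\vec{r},t)\,\hat{e}_y \tag{7.6}$$

Circularly-polarized basis:

$$\vec{E}(\vec{r},t) = \begin{pmatrix} E_x(\vec{r},t) \\ E_y(\vec{r},t) \\ 0 \end{pmatrix} = \frac{E_x(\vec{r},t) - iE_y(\vec{r},t)}{\sqrt{2}}\,\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} + \frac{E_x(\vec{r},t) + iE_y(\vec{r},t)}{\sqrt{2}}\,\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \\ 0 \end{pmatrix} = E_R(\vec{r},t)\,\hat{e}_R + E_L(\vec{r},t)\,\hat{e}_L \tag{7.7}$$

with $E_R = (E_x - iE_y)/\sqrt{2}$ and $E_L = (E_x + iE_y)/\sqrt{2}$.

By convention, we represent the field components by

$$E_x(\vec{r},t) = E_x^0\,e^{i(kz - \omega t + \alpha_x)} \quad \text{and} \quad E_y(\vec{r},t) = E_y^0\,e^{i(kz - \omega t + \alpha_y)}$$

where $\alpha_x$ and $\alpha_y$ are the (real) phases and $E_x^0$ and $E_y^0$ are the
(real) amplitudes of the electric field components.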

Clearly, the polarization state of the light is directly related to the E vectors in 
this formulation. 


For example, using the above equations, we have these cases: 
1. If $E_y = 0$, then the wave is plane polarized in the x-direction

$$\vec{E} = E_x\hat{e}_x = E_x\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \tag{7.8}$$

2. If $E_x = 0$, then the wave is plane polarized in the y-direction

$$\vec{E} = E_y\hat{e}_y = E_y\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \tag{7.9}$$

3. If $E_x = E_y$, then the wave is plane polarized at 45°

$$\vec{E} = E_x\hat{e}_x + E_y\hat{e}_y = E_x\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \tag{7.10}$$

4. If $E_y = iE_x = e^{i\pi/2}E_x$, then the y-component lags the x-component by
90° and the wave is right circularly polarized

$$\vec{E} = \sqrt{2}\,E_x\hat{e}_R = E_x\begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} \tag{7.11}$$

5. If $E_y = -iE_x = e^{-i\pi/2}E_x$, then the y-component leads the x-component by
90° and the wave is left circularly polarized

$$\vec{E} = \sqrt{2}\,E_x\hat{e}_L = E_x\begin{pmatrix} 1 \\ -i \\ 0 \end{pmatrix} \tag{7.12}$$
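
The change between the plane-polarized and circularly polarized descriptions is just a change of basis, which a few lines of Python can make explicit. This is a sketch under stated assumptions: only the x and y components are kept (the z component is zero for propagation along z), the basis vectors are those of Eq. (7.4), and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

# (x, y) components of the circular basis vectors of Eq. (7.4).
E_R = (1.0 / SQRT2, 1j / SQRT2)
E_L = (1.0 / SQRT2, -1j / SQRT2)


def inner(u, v):
    """Complex inner product <u|v> of two-component vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def circular_components(ex, ey):
    """Coefficients (c_R, c_L) such that E = c_R e_R + c_L e_L."""
    field = (complex(ex), complex(ey))
    return inner(E_R, field), inner(E_L, field)


def rebuild(c_r, c_l):
    """Recover (E_x, E_y) from the circular coefficients."""
    return tuple(c_r * r + c_l * l for r, l in zip(E_R, E_L))


# A field proportional to (1, i, 0) has no e_L component.
c_r_circ, c_l_circ = circular_components(1.0, 1j)

# An x-polarized field is an equal-weight combination of e_R and e_L.
c_r_lin, c_l_lin = circular_components(1.0, 0.0)
ex_back, ey_back = rebuild(c_r_lin, c_l_lin)

print(c_r_circ, c_l_circ, c_r_lin, c_l_lin)
```

The coefficients are obtained by projecting onto the basis vectors, which works because $\hat{e}_R$ and $\hat{e}_L$ are orthonormal under the complex inner product.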


For our present discussion, the above properties of polarization are sufficient. 
We will give a more detailed discussion of the quantum mechanics of photon 
polarization shortly. 


The picture we are proposing assumes that each photon has the same polarization
as the light wave, which is, in fact, verified by experiment.


This simple experimental property leads to some fundamental difficulties for 
classical mechanics. 


If the incident beam is polarized parallel or perpendicular to the optic axis of a
Polaroid, then classical physics has no problems: either all the photons (and thus
all the energy) pass through the Polaroid, or none of the photons (and thus none
of the energy) pass through.

But what about the case where the wave is polarized at 45° to the optic axis of
the Polaroid?

For the beam as a whole, the experimental result is that 1/2 ($\cos^2 45^\circ = 1/2$)
of the total energy and hence 1/2 of the photons pass through.

But what about any particular photon, each of which is polarized at 45° to the
optic axis?

Now the answer is not clear at all and the fundamental dilemma of the
subatomic world rears its ugly head.

As will become clear during our discussions of quantum mechanics, this question 
about what will happen to a particular photon under certain conditions is not 
very precise. 

In order for any theory to make clear predictions about experiments, we will 
have to learn how to ask very precise questions. We must also remember that 
only questions about the results of experiments have a real significance in physics 
and it is only such questions that theoretical physics must consider. 

All relevant questions and the subsequent experiments devised to answer the 
questions must be clear and precise, however. 

In this case, we can make the question clear by doing the experiment with a 
beam containing only one photon and observe what happens after it arrives at 
the Polaroid. In particular, we make a simple observation to see whether or not 
it passes through the Polaroid. 

The most important experimental result is that this single photon either passes 
through the Polaroid or it does not. I will call this type of experiment a go-nogo 
or yes-no experiment. 

We never observe 1/2 the energy of a single photon. We always observe either
zero energy or an energy exactly equal to $\hbar\omega$. One never observes a part
of a photon passing through and a part getting absorbed in the Polaroid.

In addition, if a photon gets through, then experiment says that its polarization
vector changes such that it ends up polarized in a direction parallel to the optic
axis of this particular Polaroid (instead of at 45° with respect to that axis as it
was polarized beforehand).

In a beam of N photons, each photon will independently behave as the single
photon did in the description above. No experiment can determine which of the
photons will pass through and which will not, even though they are all
identical. In each experiment, however, for large N very nearly 1/2 of the total
energy and 1/2 of the photons will pass through the 45° Polaroid.

As we shall show later, the only way this result can be interpreted is to say that
each photon has a probability = 1/2 of passing through the 45° Polaroid.
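
The probabilistic reading can be mimicked with a toy simulation. The sketch below is only a classical random-number model, not a derivation: it assumes each photon independently passes with probability $\cos^2\alpha$, where $\alpha$ is the angle between its polarization and the optic axis, and the function names are hypothetical.

```python
import math
import random


def transmission_probability(angle_deg):
    """Probability cos^2(alpha) that a single photon passes the Polaroid."""
    return math.cos(math.radians(angle_deg)) ** 2


def send_photons(n_photons, angle_deg, rng):
    """Send photons one at a time; each gives a yes-no result.

    Returns the number of photons that pass through.
    """
    p = transmission_probability(angle_deg)
    return sum(1 for _ in range(n_photons) if rng.random() < p)


rng = random.Random(1)
n = 100000
passed_45 = send_photons(n, 45.0, rng)   # fluctuates around n / 2
passed_0 = send_photons(n, 0.0, rng)     # parallel to the optic axis

print(passed_45 / n, passed_0 / n)
```

Each photon either passes or does not; the fraction $\cos^2\alpha$ of the classical beam appears only as a frequency over many photons, with statistical fluctuations that shrink as N grows.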

We are led to this probabilistic point of view because the energy of the
electromagnetic wave is quantized (or equivalently, that the electromagnetic wave
is made up of photons) and we cannot have fractions of the energy quantum
appearing during an experiment.

We have managed to preserve the indivisibility of the photons (the quantization
of their energy), but we were able to do this only by abandoning the comforting
determinacy of classical physics and introducing probability.

The results in this experiment are not completely determined by the initial
experimental conditions under control of the experimenter, as they would
have been according to classical ideas.

As we shall see, the most that we will be able to predict in any experiment is a 
set of possible results, with a probability of occurrence for each. 

The experiment described above involving a single photon polarized at an angle 
to the optic axis, allows us to ask only one type of experimental and theoretical 
question, namely, does the photon go through or is it absorbed - only a yes-no 
experiment? That will turn out to be the only legitimate question we can ask 
in this case. 

It is the first indication of the way we should frame our discussion of theory! 

We shall see that questions like.... 

What decides whether a particular photon goes through? 

When does a particular photon decide whether it will pass through? 

How does a particular photon change its polarization direction? 

cannot be answered by experiment and, therefore, must be regarded as outside 
the domain of quantum theory and possibly all of physics. 

What will our theory of quantum mechanics say about the state of the single 
photon? 

It will be shown that the photon polarized at an angle to the optic axis is in
a very special kind of state that we will call a superposition of being polarized
perpendicular to the optic axis and of being polarized parallel to the optic axis,
i.e., a superposition of all the possibilities.

In this state, there will exist an extraordinary kind of relationship between the
two kinds (mutually perpendicular directions) of polarization.

The meaning of the word superposition will follow clearly from the mathematical
formalism and language we have developed in this book. It will, however,
require a new physical connection to mathematics.

This, as we shall see later, is suggested by an attempt to express the meaning
of superposition in ordinary language (words). If we attempt to explain the
behavior of the photon polarized at an angle to the optic axis using ordinary
language, then we might be inclined to say something like this:

The photon is 

not polarized parallel to the optic axis 
not polarized perpendicular to the optic axis 
not simultaneously possessing both polarizations 
not possessing neither polarization 

For this experiment with only two possible polarizations, these statements
exhaust all the logical possibilities allowed by ordinary words!

Superposition is something completely different from any of the above and it is
not all of the above.

Its physical content will, however, be precise and clear in our new mathematical 
formalism. 

When the photon encounters the Polaroid, we are observing it. We are observing
whether it is polarized perpendicular or parallel to the optic axis of the
Polaroid. The effect of this measurement will be to end up with the photon having
one or the other polarization (the one we measure). In such a measurement,
the photon always makes a jump from a state of superposition to a state of
definite polarization. Which of the two states it jumps to cannot be predicted.
We will, as we shall see, be able to predict the probability of each possibility.

If it jumps into the parallel state, it has passed through. If it jumps into the 
perpendicular state, it has been absorbed. 

We will have a great deal more to say about the two new words, superposition
and jump, as we proceed. Do not attach any classical meaning to the word jump
as used here. It is simply my attempt to use words for the moment.

7.1.1. Photon Interference 

Another classic experiment involving light waves and hence photons is two-slit 
interference. The ideas of superposition and jump will reappear in this case and 
give us another view to enhance our understanding (or maybe confusion) at this 
point. 

We will discuss this type of experiment in great detail and in many different 
forms in this book. As we shall see, the meaning of this experiment is easily 
misunderstood. 

This experiment looks at the position and momentum properties of photons 
instead of the polarization properties. 

For an approximately monochromatic light wave we have some knowledge of 
both the position and momentum of the photons. In particular, the photons 
must be located near the beam of light and their momentum is in the direction 
of propagation of the beam and has an approximate magnitude 

$$p \approx \frac{E}{c} = \frac{\hbar\omega}{c} = \frac{hf}{c} = \frac{h}{\lambda} \tag{7.13}$$

According to the standard wave interference description, a two-slit interference
experiment consists of a device which, in some manner, splits the incoming
beam into two beams which are sent over different physical paths and then
recombined, in some way, so that they can interfere and produce a distinctive
pattern on a screen as we discussed in Chapters 1-3.

To see the problems directly, we again consider an incident beam consisting of 
a single photon. What happens as it passes through the apparatus? 

The photon has only two classical possibilities if it is going to reach the screen 
and contribute to the interference pattern, i.e., it must follow a path that passes 
through one of the two slits. 

As in the polarization experiment, the correct quantum description will involve 
describing the single photon as a superposition of photons traveling on (at least) 
two different paths to the same point on a screen. 

Once again, as we shall see in detail later on, if we try to use ordinary
language to describe the experimental results in the interference experiment we
find ourselves saying that the photon is

not on path #1
not on path #2
not simultaneously on both paths
not on neither path

What actually happens when we try to determine the energy of a photon in one 
of the beams? 

When we do the measurement, we always find the photon (all of the energy in
the single photon system) in one of the beams. When we observe the photon it
must somehow jump from being in a superposition (which generates the
interference pattern) to being entirely in one of the beams (which, as it turns
out, does not generate the interference pattern). One possibility is that the
jump is caused by the measurement of the energy in this case.

Although we cannot predict on which path the photon will be found, we can
predict the probability of either result from the mathematical formalism we
will develop for superposition.

Even more striking is the following result. Suppose we have a single beam of 
light consisting of a large number of photons, which we split up into two beams 
of equal intensity. On the average we should have about half of the photons in 
each beam. 

Now make the two beams interfere. 

If we were to assume that the interference is the result of a photon in one beam
interfering with a photon in the other beam, then we would have to allow two
photons to annihilate each other some of the time (producing interference
minima) and sometimes turn into four photons (producing interference maxima).
This process clearly violates conservation of energy.

The new concept of superposition will enable us to deal with this problem. Each
photon in the original beam will be in a superposition of being in both beams.
The superposition of possibilities will produce the interference effect.
Interference between two different photons or of the photon with itself is never
required and energy will always be conserved.

We will have a great deal more to say about these and other experiments as we
develop the mathematical formalism of quantum mechanics.

Some questions about this approach

Is it really necessary to introduce the new concepts of superposition 
and jump? 

In many simple experiments, the classical model of waves and photons connected
in some vague statistical way will be sufficient to explain many experimental
results. The new quantum ideas do not contribute anything new in these cases.
As we shall see as we progress through this chapter, there are, however, an
overwhelming number of experiments where the only correct explanation comes
from a quantum theory built around these new concepts and their associated
mathematical formalism.

Will this new theory give us a better model of the photon and of 
single photon processes? 

I do not believe that the object of a physical theory is to provide a satisfying
model or picture, but instead, to develop a language that enables us to
formulate physical laws and make experimental predictions.

Models and pictures are holdovers from classical physics as applied to
macroscopic phenomena. In the case of atomic phenomena and beyond we cannot
expect to find an appropriate model or picture. All such models and pictures
rely on classical concepts and these concepts are totally inadequate in the
microscopic arena. Their main goal seems to be to make the reader feel more
comfortable with the new and strange phenomena under discussion. Their many
misleading implications, however, often cause misunderstandings that lead many
a student down dead-end paths as they try to understand quantum mechanics.
We will avoid all non-mathematical models.

Unless one has mastered the correct language and adjusted one's mode of
thinking, it will not, in general, be possible to understand modern theoretical
physics. For me, a self-consistent mathematical formulation of the theory that
uses a language appropriate to the physical phenomena involved and is able to
make correct predictions for all experiments constitutes a valid physical model.

The reasons for observed physical behavior are contained in the mathematical
formalism. There is no deeper physics involved beyond the mathematical
formalism. The only model is the mathematical formalism and its associated
physical connections along with a language that is needed to understand it and
express what it means in terms of experiments.

What about determinism? 

The problems that the concept of superposition presents to the classical idea
of determinism when we create a well-defined system are so devastating that
these old ideas must be abandoned. We will discuss these ideas in great detail
throughout the book, but for now a short digression about states, superposition
and measurements seems appropriate.

A classical state is specified by giving numerical values for all the coordinates 
and velocities of the constituents of the system at some instant of time. The 
subsequent motion is then determined if we know the force laws. 

For a small system, however, we cannot observe it and determine its properties
to the extent required for a complete classical description. Thus, any
microscopic system is necessarily specified by fewer or more indefinite data
than all the coordinates and velocities at some instant of time.

In the new mathematical formalism, we will carefully define the kind of
information we can know about a state, which information we can actually know
at a given instant of time and how we will prepare such states. The prepared
states will have definite values of a specific set of measurable quantities.

In terms of such states, we will define a superposition such that we have
well-defined mathematical relationships between the different states making up
the superposition. The mathematical definition of a superposition will not be
describable with the classical ideas or pictures or models available, i.e., it
will turn out that when we superpose two states, the nature of the relationship
that exists between them cannot be explained in terms of classical physical
concepts. The system is not partly in one state and partly in the other in any
sense.

We will have to deal with a completely new idea here. 

During the course of our discussions, we will need to get accustomed to the
mathematical formalism behind the concept of superposition and we will need
to rely on the formalism and the mathematics, as expressed by the Dirac
language, without having any detailed classical models.

The new superposed state will be completely defined by the states involved, the 
mathematical relationships between them and the physical meaning of those 
mathematical relationships as defined by the formalism. 

As we saw in the polarization and interference experiments, we must have a
superposition of two possible states, say A and B. Suppose that if the system
were in state A alone, then a particular measurement would, with certainty,
produce the result a and if the system were in state B alone, then the same
measurement would, with certainty, produce the result b. Then, if the system
is in the superposed state (of A and B), it turns out that the same measurement
will sometimes produce the result a and sometimes the result b. Given
the mathematical definition of the superposed state and the rules that we will
specify to relate the physics to the mathematics, we will be able to predict the
probability of getting the result a and the probability of getting the result b.

We will not be able to predict which result we will get in any one measurement,
i.e., the measurement process will not be deterministic. Identical measurements
on identically prepared states will not yield identical results in any one
measurement.

If we repeat the measurements a large number of times, however, we can predict
the fraction of the time we will obtain each result. That is all we will be able
to predict using the formalism of quantum mechanics we will develop. I firmly
believe that nature is such that we will not, under any circumstances, be able
to make any more detailed predictions for quantum systems.

So, let us summarize the situation before we proceed.

When physicists try to construct a theory from their experimental observations 
they are primarily concerned with two things: 

how to calculate something that will enable them to 
predict the results of further experiments 

how to understand what is going on in the experiments 

Now, it is not always possible to satisfy both of these concerns at the same 
time. Sometimes we have a reasonably clear idea of what is going on, but the 
mathematical details are so complex that calculating is very hard. Sometimes 
the mathematics is quite clear, but the understanding is difficult. 

Quantum mechanics falls into the latter category. 

Over the years since quantum mechanics was first proposed, a very pragmatic 
attitude has formed. 

Physicists realized that an understanding of how electrons could behave in such 
a manner would come once they developed the rules that would enable them to 
calculate the way they behaved. 

So quantum mechanics as a set of rules was developed, as we shall see in this 
book. 

It has been tested in a large number of experiments and never been found to be 
wrong. It works extremely well. 

Yet, we are still not sure how the electrons can behave in such a manner.

In many ways, the situation has even gotten worse over the years. Some people,
as we shall see, held out the hope that quantum mechanics was an incomplete
theory and that as we did more experiments we would discover some loose end
or new idea that would allow us to make sense of things.

This has not happened and I believe it never will. 

We now carry out the details of a special case that will illustrate how Quantum 
Mechanics works and also illustrate the mathematical formalism that we have 
developed in earlier chapters. 


7.2. Photon Polarization 


As we mentioned earlier, the electric field vector E of plane electromagnetic
waves lies in a plane perpendicular to the direction of propagation of the wave.
If we choose the z-axis as the direction of propagation, we can represent the
electric field vector as a 2-dimensional vector in the x-y plane. This means
that we will only require two numbers to describe the electric field. Since the
polarization state of the light is directly related to the electric field vector, this
means that we can also represent the polarization states of the photons by
2-component column vectors or ket vectors of the form

$$|\psi\rangle = \begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix} \tag{7.14}$$

where we assume the normalization condition $\langle\psi|\psi\rangle = 1$.


The components in the state vectors depend only on the actual polarization 
state of the photon. The state vector contains all of the information that we 
can have about the state of polarization of the photon (remember that it consists 
of just two numbers). 


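Examples

$$
\begin{aligned}
|x\rangle &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} && \rightarrow \text{x linear polarized photon} \\
|y\rangle &= \begin{pmatrix} 0 \\ 1 \end{pmatrix} && \rightarrow \text{y linear polarized photon} \\
|R\rangle &= \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix} && \rightarrow \text{right circular polarized photon} \\
|L\rangle &= \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix} && \rightarrow \text{left circular polarized photon} \\
|45\rangle &= \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} && \rightarrow \text{photon linearly polarized at } 45^\circ \text{ to the x-axis}
\end{aligned}
$$

The bra vector or linear functional corresponding to the ket vector $|\psi\rangle$ is
given by the row vector

$$\langle\psi| = \begin{pmatrix} \psi_x^* & \psi_y^* \end{pmatrix} \tag{7.15}$$

which clearly implies via our inner product rules

$$\langle\psi|\psi\rangle = \begin{pmatrix} \psi_x^* & \psi_y^* \end{pmatrix}\begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix} = |\psi_x|^2 + |\psi_y|^2 = 1 \tag{7.16}$$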


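In general, for

$$|\phi\rangle = \begin{pmatrix} \phi_x \\ \phi_y \end{pmatrix} \tag{7.17}$$

the inner product rule says that

$$\langle\phi|\psi\rangle = \begin{pmatrix} \phi_x^* & \phi_y^* \end{pmatrix}\begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix} = \phi_x^*\psi_x + \phi_y^*\psi_y = \langle\psi|\phi\rangle^* \tag{7.18}$$

We also have the results

$$\langle x|x\rangle = 1 = \langle y|y\rangle \ \text{ and } \ \langle x|y\rangle = 0 = \langle y|x\rangle \ \rightarrow \ \text{orthonormal set} \tag{7.19}$$

$$\langle R|R\rangle = 1 = \langle L|L\rangle \ \text{ and } \ \langle R|L\rangle = 0 = \langle L|R\rangle \ \rightarrow \ \text{orthonormal set} \tag{7.20}$$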

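Each of these two sets of state vectors is a basis for the 2-dimensional vector
space of polarization states since any other state vector can be written as a
linear combination of them, i.e.,

$$|\psi\rangle = \begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix} = \psi_x\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \psi_y\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \psi_x|x\rangle + \psi_y|y\rangle \tag{7.21}$$

or

$$|\psi\rangle = \begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix} = \frac{\psi_x - i\psi_y}{2}\begin{pmatrix} 1 \\ i \end{pmatrix} + \frac{\psi_x + i\psi_y}{2}\begin{pmatrix} 1 \\ -i \end{pmatrix} = \frac{\psi_x - i\psi_y}{\sqrt{2}}|R\rangle + \frac{\psi_x + i\psi_y}{\sqrt{2}}|L\rangle \tag{7.22}$$

We can find the components along the basis vectors using

$$\langle x|\psi\rangle = \langle x|\left(\psi_x|x\rangle + \psi_y|y\rangle\right) = \psi_x\langle x|x\rangle + \psi_y\langle x|y\rangle = \psi_x \tag{7.23}$$

$$\langle y|\psi\rangle = \langle y|\left(\psi_x|x\rangle + \psi_y|y\rangle\right) = \psi_x\langle y|x\rangle + \psi_y\langle y|y\rangle = \psi_y \tag{7.24}$$

or

$$|\psi\rangle = |x\rangle\langle x|\psi\rangle + |y\rangle\langle y|\psi\rangle \tag{7.25}$$

and similarly

$$|\psi\rangle = |R\rangle\langle R|\psi\rangle + |L\rangle\langle L|\psi\rangle \tag{7.26}$$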

Basically, we are illustrating examples of a superposition principle, which says
that any arbitrary polarization state can be written as a superposition (linear
combination) of x- and y-polarization states or equivalently, as a superposition
of right- and left-circularly polarized states.
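
The two expansions can be checked numerically. The following is a minimal sketch in plain Python, not part of the formalism itself: kets are stored as 2-component tuples in the x-y basis, the helper names (`inner`, `expand`, `rebuild`) are invented for this illustration, and the 30° test state is an arbitrary assumption.

```python
import math

S = 1 / math.sqrt(2)

# Kets are 2-component tuples of complex numbers written in the {|x>, |y>} basis.
X = (1 + 0j, 0j)
Y = (0j, 1 + 0j)
R = (S + 0j, S * 1j)     # (1, i)/sqrt(2)
L = (S + 0j, -S * 1j)    # (1, -i)/sqrt(2)


def inner(phi, psi):
    """Inner product <phi|psi>; the bra components are complex-conjugated."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def expand(psi, basis):
    """Components <b|psi> of psi along each ket b of an orthonormal basis."""
    return [inner(b, psi) for b in basis]


def rebuild(components, basis):
    """Form the superposition sum_b c_b |b> from components and basis kets."""
    return tuple(sum(c * b[i] for c, b in zip(components, basis)) for i in range(2))


# Hypothetical test state: linear polarization at 30 degrees to the x-axis.
angle = math.radians(30)
psi = (complex(math.cos(angle)), complex(math.sin(angle)))

xy_components = expand(psi, (X, Y))   # [psi_x, psi_y]
rl_components = expand(psi, (R, L))   # [(psi_x - i psi_y)/sqrt2, (psi_x + i psi_y)/sqrt2]

psi_from_xy = rebuild(xy_components, (X, Y))
psi_from_rl = rebuild(rl_components, (R, L))
```

Both `psi_from_xy` and `psi_from_rl` reproduce `psi` up to rounding, which is the content of the two expansions of the same state in the linear and circular bases.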

Our earlier discussions of a beam of light passing through a polaroid can now 
be recast in terms of these polarization states. 

Classical physics says that the beam is a superposition of an x-polarized beam
and a y-polarized beam and when this beam passes through an x-polaroid, its
effect is to remove the y-polarized beam and pass the x-polarized beam through
unchanged.

The energy of the classical beam is given by $|E|^2$, which for the polarization
states is proportional to $|\psi_x|^2 + |\psi_y|^2$. Thus, the beam energy after
passing through an x-polaroid is proportional to $|\psi_x|^2$. The fraction of the
beam energy or the fraction of the number of photons in the beam that passes
through is given by

$$\frac{|\psi_x|^2}{|\psi_x|^2 + |\psi_y|^2} = |\psi_x|^2 = |\langle x|\psi\rangle|^2 \tag{7.27}$$

for states normalized to 1. Our earlier discussion for the case of a single photon
forced us to set this quantity equal to the probability of a single photon in the
state $|\psi\rangle$ passing through an x-polaroid or

probability of a photon in the state $|\psi\rangle$
passing through an x-polaroid = $|\langle x|\psi\rangle|^2$

This agrees with the mathematical results we derived in Chapter 6. 
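
As a numerical illustration of this rule, here is a small sketch in plain Python. The function names (`pass_probability`, `linear`) and the chosen angles are assumptions made for the example; only the formula $|\langle x|\psi\rangle|^2$ comes from the text.

```python
import cmath
import math


def inner(phi, psi):
    """Inner product <phi|psi> for 2-component kets."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def pass_probability(psi, pass_state):
    """|<pass_state|psi>|^2 for a normalized photon state psi."""
    return abs(inner(pass_state, psi)) ** 2


def linear(theta_deg):
    """Ket for linear polarization at theta_deg degrees from the x-axis."""
    t = math.radians(theta_deg)
    return (complex(math.cos(t)), complex(math.sin(t)))


X = linear(0)
R = (1 / math.sqrt(2) + 0j, 1j / math.sqrt(2))

p_x = pass_probability(linear(0), X)     # x-polarized photon, x-polaroid
p_30 = pass_probability(linear(30), X)   # cos^2(30 degrees)
p_y = pass_probability(linear(90), X)    # y-polarized photon, x-polaroid
p_R = pass_probability(R, X)             # right circular photon, x-polaroid

# An overall phase factor changes the amplitude but not the probability.
phase = cmath.exp(1j * 0.9)
shifted = tuple(phase * c for c in linear(30))
p_30_shifted = pass_probability(shifted, X)
```

The values come out as 1, 0.75, 0 and 0.5 (up to rounding), and multiplying the state by an overall phase leaves the probability unchanged.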

This all makes sense in an ensemble interpretation of the state vector, that is, 
that the state vector represents the beam or equivalently, many copies of a single 
photon system. 

We then define $\langle x|\psi\rangle$ as the probability amplitude for the individual
photon to pass through the x-polaroid.

Another example confirming these results is light passing through a prism.
A prism passes right-circularly-polarized (RCP) light and rejects (absorbs)
left-circularly-polarized (LCP) light.

Since we can write

$$|\psi\rangle = |R\rangle\langle R|\psi\rangle + |L\rangle\langle L|\psi\rangle \tag{7.28}$$

we can generalize the polaroid result to say

amplitude that a photon in the state $|\psi\rangle$
passes through a prism = $\langle R|\psi\rangle$

and

probability that a photon in the state $|\psi\rangle$
passes through a prism = $|\langle R|\psi\rangle|^2$

Polaroids and prisms are examples of go-nogo devices. Certain photons are 
passed through while others are absorbed in these devices. 

We note that the probability is independent of the phase of the state, but 
the amplitude depends on phase. This will be a crucial point later on in our 
discussion. 

7.2.1. How Many Basis Sets?

We have already seen two examples of basis sets for the 2-dimensional vector
space of polarization states, namely,

$$\{|x\rangle, |y\rangle\} \ \text{ and } \ \{|R\rangle, |L\rangle\} \tag{7.29}$$

In the 2-dimensional vector space there are an infinite number of such
orthonormal basis sets related to the $\{|x\rangle, |y\rangle\}$ set. They are all
equivalent for describing physical systems (they correspond to different
orientations of the polaroid in the experimental measurement). We can obtain the
other sets, say $\{|x'\rangle, |y'\rangle\}$, by a rotation of the bases (or axes)
as shown in Figure 7.1 below.




Figure 7.1: Rotation of Axes 


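We then have in the x-y basis

$$|\psi\rangle = \psi_x|x\rangle + \psi_y|y\rangle \tag{7.30}$$

and if we choose to use the equivalent x'-y' basis we have

$$|\psi\rangle = \psi_{x'}|x'\rangle + \psi_{y'}|y'\rangle \tag{7.31}$$

How are these components related to each other? We have

$$|\psi\rangle = |x\rangle\langle x|\psi\rangle + |y\rangle\langle y|\psi\rangle \tag{7.32}$$

which implies

$$\langle x'|\psi\rangle = \langle x'|x\rangle\langle x|\psi\rangle + \langle x'|y\rangle\langle y|\psi\rangle \tag{7.33}$$

$$\langle y'|\psi\rangle = \langle y'|x\rangle\langle x|\psi\rangle + \langle y'|y\rangle\langle y|\psi\rangle \tag{7.34}$$

or in matrix notation

$$\begin{pmatrix} \langle x'|\psi\rangle \\ \langle y'|\psi\rangle \end{pmatrix} = \begin{pmatrix} \langle x'|x\rangle & \langle x'|y\rangle \\ \langle y'|x\rangle & \langle y'|y\rangle \end{pmatrix}\begin{pmatrix} \langle x|\psi\rangle \\ \langle y|\psi\rangle \end{pmatrix} \tag{7.35}$$

So we can transform the basis (transform the components) if we can determine
the 2x2 transformation matrix

$$\begin{pmatrix} \langle x'|x\rangle & \langle x'|y\rangle \\ \langle y'|x\rangle & \langle y'|y\rangle \end{pmatrix} \tag{7.36}$$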


It turns out that this result is quite general in the sense that it holds for any 
two bases, not just the linearly polarized bases we used to derive it. 


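For the linear (plane) polarized case, we can think of an analogy to unit vectors
along the axes in ordinary space as shown on the right in Figure 7.1.

Then we have (by analogy)

$$\hat{e}_x \cdot \hat{e}_{x'} = \cos\theta = \langle x'|x\rangle \ , \quad \hat{e}_{x'} \cdot \hat{e}_y = \sin\theta = \langle x'|y\rangle \tag{7.37}$$

$$\hat{e}_{y'} \cdot \hat{e}_y = \cos\theta = \langle y'|y\rangle \ , \quad \hat{e}_{y'} \cdot \hat{e}_x = -\sin\theta = \langle y'|x\rangle \tag{7.38}$$

or

$$|x'\rangle = \langle x|x'\rangle|x\rangle + \langle y|x'\rangle|y\rangle = \cos\theta\,|x\rangle + \sin\theta\,|y\rangle \tag{7.39}$$

$$|y'\rangle = \langle x|y'\rangle|x\rangle + \langle y|y'\rangle|y\rangle = -\sin\theta\,|x\rangle + \cos\theta\,|y\rangle \tag{7.40}$$

and the transformation matrix, $\hat{R}(\theta)$, is

$$\hat{R}(\theta) = \begin{pmatrix} \langle x'|x\rangle & \langle x'|y\rangle \\ \langle y'|x\rangle & \langle y'|y\rangle \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \tag{7.41}$$

so that

$$\begin{pmatrix} \langle x'|\psi\rangle \\ \langle y'|\psi\rangle \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \langle x|\psi\rangle \\ \langle y|\psi\rangle \end{pmatrix} \tag{7.42}$$

There are two equivalent ways to interpret these results.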


First, we could say it tells us the components of $|\psi\rangle$ in the rotated basis
(we keep the vector fixed and rotate the axes). Second, we can rotate the vector
and keep the axes fixed (rotate in the opposite direction). In this case, we regard

$$\begin{pmatrix} \langle x'|\psi\rangle \\ \langle y'|\psi\rangle \end{pmatrix} \tag{7.43}$$

as a new vector $|\psi'\rangle$ whose components in the fixed x-y basis are the same
as the components of $|\psi\rangle$ in the x'-y' basis or

$$\langle x'|\psi\rangle = \langle x|\psi'\rangle \ \text{ and } \ \langle y'|\psi\rangle = \langle y|\psi'\rangle \tag{7.44}$$

For real $\psi_x$ and $\psi_y$, $|\psi'\rangle$ is the vector $|\psi\rangle$ rotated clockwise
by $\theta$ or, regarding $\hat{R}(\theta)$ as a linear operator in the vector space, we have

$$|\psi'\rangle = \hat{R}(\theta)|\psi\rangle \tag{7.45}$$


It is not a Hermitian operator, however, so it cannot represent an observable. It
is a transformation of vectors and according to our earlier discussion in Chapter
6, it should be a unitary operator. We can see this as follows:

$$\hat{R}^{-1}(\theta) = \hat{R}(-\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \hat{R}^{T}(\theta) = \hat{R}^{\dagger}(\theta) \tag{7.46}$$
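
The unitarity property can be confirmed numerically. The sketch below, in plain Python, builds the rotation matrix for an arbitrarily chosen angle and checks that its conjugate transpose and its value at the negated angle both invert it. The helper names and the angle 0.4 rad are assumptions for illustration only.

```python
import math


def rotation(theta):
    """Matrix R(theta) with entries <x'|x>, <x'|y>, <y'|x>, <y'|y>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def apply(m, v):
    """Matrix acting on a 2-component column of components."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]


theta = 0.4  # arbitrary rotation angle in radians
R = rotation(theta)

unitarity_check = matmul(dagger(R), R)        # should be the identity
inverse_check = matmul(rotation(-theta), R)   # should be the identity

# Components of an x-polarized photon in the rotated basis: (cos theta, -sin theta).
x_in_rotated_basis = apply(R, [1.0, 0.0])
```

Since the matrix is real, the conjugate transpose reduces to the ordinary transpose, in line with the chain of equalities above.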

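Since $\hat{R}(\theta)$ is a unitary transformation operator for rotations, our earlier
discussions in Chapter 6 say that we should be able to express it as an exponential
operator involving the angular momentum with respect to the axis of rotation
(z-axis) $\hat{J}_z$, of the form

$$\hat{R}(\theta) = e^{i\theta\hat{J}_z/\hbar} \tag{7.47}$$

Now, we can rewrite $\hat{R}(\theta)$ as

$$
\begin{aligned}
\hat{R}(\theta) &= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \cos\theta\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + i\sin\theta\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \\
&= \cos\theta\,\hat{I} + i\sin\theta\,\hat{Q}
\end{aligned}
\tag{7.48}
$$

where the physical meaning of the operator

$$\hat{Q} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{7.49}$$

is yet to be determined.

Expanding (7.47) in a power series we have

$$
\begin{aligned}
\hat{R}(\theta) &= \hat{R}(0) + \frac{1}{1!}\left.\frac{d\hat{R}(\theta)}{d\theta}\right|_{\theta=0}\theta + \frac{1}{2!}\left.\frac{d^2\hat{R}(\theta)}{d\theta^2}\right|_{\theta=0}\theta^2 + \frac{1}{3!}\left.\frac{d^3\hat{R}(\theta)}{d\theta^3}\right|_{\theta=0}\theta^3 + \frac{1}{4!}\left.\frac{d^4\hat{R}(\theta)}{d\theta^4}\right|_{\theta=0}\theta^4 + \ldots \\
&= e^{i\theta\hat{J}_z/\hbar} = \hat{I} + \frac{1}{1!}\left(\frac{i\hat{J}_z}{\hbar}\right)\theta + \frac{1}{2!}\left(\frac{i\hat{J}_z}{\hbar}\right)^2\theta^2 + \frac{1}{3!}\left(\frac{i\hat{J}_z}{\hbar}\right)^3\theta^3 + \frac{1}{4!}\left(\frac{i\hat{J}_z}{\hbar}\right)^4\theta^4 + \ldots
\end{aligned}
\tag{7.50}
$$

Now let us assume that $\hbar\hat{Q} = \hat{J}_z$, where $\hat{J}_z$ is defined in (7.47). We then
have the following algebraic results.

$$\hat{J}_z^2 = \hbar^2\hat{Q}^2 = \hbar^2\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \hbar^2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \hbar^2\hat{I} \tag{7.51}$$

which implies that

$$\hat{J}_z^3 = \hbar^2\hat{J}_z \ , \quad \hat{J}_z^4 = \hbar^2\hat{J}_z^2 = \hbar^4\hat{I} \ , \quad \hat{J}_z^5 = \hbar^4\hat{J}_z \ \text{ and so on} \tag{7.52}$$

This gives

$$
\begin{aligned}
\hat{R}(\theta) = e^{i\theta\hat{J}_z/\hbar} &= \hat{I} + (i)\frac{\hat{J}_z}{\hbar}\theta + \frac{(i)^2}{2!}\hat{I}\theta^2 + \frac{(i)^3}{3!}\frac{\hat{J}_z}{\hbar}\theta^3 + \frac{(i)^4}{4!}\hat{I}\theta^4 + \ldots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \ldots\right)\hat{I} + i\left(\theta - \frac{\theta^3}{3!} + \ldots\right)\frac{\hat{J}_z}{\hbar} \\
&= \cos\theta\,\hat{I} + \frac{i}{\hbar}\sin\theta\,\hat{J}_z = \cos\theta\,\hat{I} + i\sin\theta\,\hat{Q}
\end{aligned}
\tag{7.53}
$$

as in (7.48).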
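
The agreement between the power series and the closed form can also be checked numerically. The following plain-Python sketch sums the first 30 terms of the exponential series for $e^{i\theta\hat{Q}}$ and compares the result with $\cos\theta\,\hat{I} + i\sin\theta\,\hat{Q}$. The truncation at 30 terms, the angle 0.7 rad and the helper names are assumptions chosen for the example.

```python
import math

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
Q = [[0j, -1j], [1j, 0j]]   # Q = J_z / hbar in the {|x>, |y>} basis


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def exp_series(theta, terms=30):
    """Partial sum of sum_n (i*theta*Q)^n / n! over the first `terms` terms."""
    total = [[0j, 0j], [0j, 0j]]
    term = I2                                  # n = 0 term
    for n in range(terms):
        total = add(total, term)
        # Next term: multiply by (i*theta*Q) and divide by (n + 1).
        term = scale(1j * theta / (n + 1), matmul(term, Q))
    return total


def closed_form(theta):
    """cos(theta) I + i sin(theta) Q."""
    return add(scale(math.cos(theta), I2), scale(1j * math.sin(theta), Q))


theta = 0.7
series = exp_series(theta)
closed = closed_form(theta)
max_difference = max(abs(series[i][j] - closed[i][j])
                     for i in range(2) for j in range(2))
q_squared = matmul(Q, Q)   # equals the identity, which is why the series collapses
```

The property $\hat{Q}^2 = \hat{I}$ is what lets the even and odd powers separate into the cosine and sine series.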

So everything, exponential form of the rotation operator and matrix form of the
rotation operator, is consistent if we choose

$$\hat{J}_z = \hbar\hat{Q} = \hbar\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{7.54}$$

This corresponds to the matrix representation of $\hat{J}_z$ in the $\{|x\rangle, |y\rangle\}$ basis.

We now work out the eigenvectors and eigenvalues of $\hat{R}(\theta)$ as given by the
equation

$$\hat{R}(\theta)|\psi\rangle = c|\psi\rangle \tag{7.55}$$

where c = the eigenvalue corresponding to the eigenvector $|\psi\rangle$.

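Since all vectors are eigenvectors of the identity operator $\hat{I}$, we only need to
find the eigenvectors and eigenvalues of $\hat{J}_z$ in order to solve the problem for
$\hat{R}(\theta)$, i.e.,

$$\hat{R}(\theta)|\psi\rangle = \cos\theta\,\hat{I}|\psi\rangle + i\sin\theta\,\hat{Q}|\psi\rangle = \cos\theta\,|\psi\rangle + \frac{i}{\hbar}\sin\theta\,\hat{J}_z|\psi\rangle = c|\psi\rangle \tag{7.56}$$

If we let (since $\hat{R}(\theta)$ and $\hat{J}_z$ commute they have common eigenvectors)

$$\hat{J}_z|\psi\rangle = \lambda|\psi\rangle \tag{7.57}$$

then we have

$$\hat{R}(\theta)|\psi\rangle = \cos\theta\,\hat{I}|\psi\rangle + i\sin\theta\,\hat{Q}|\psi\rangle = \cos\theta\,|\psi\rangle + \frac{i}{\hbar}\sin\theta\,\hat{J}_z|\psi\rangle = c|\psi\rangle \tag{7.58}$$

or

$$c = \cos\theta + \frac{i\lambda}{\hbar}\sin\theta \tag{7.59}$$

Now, since $\hat{J}_z^2 = \hbar^2\hat{I}$, we have

$$\hat{J}_z^2|\psi\rangle = \lambda^2|\psi\rangle = \hbar^2\hat{I}|\psi\rangle = \hbar^2|\psi\rangle \tag{7.60}$$

which says that

$$\lambda^2 = \hbar^2 \ \text{ or } \ \lambda = \pm\hbar = \text{eigenvalues of } \hat{J}_z \tag{7.61}$$

We can find the corresponding eigenvectors by inserting the eigenvalues into the
eigenvalue equation

$$\hat{J}_z|J_z = \hbar\rangle = \hbar|J_z = \hbar\rangle \tag{7.62}$$

We assume that

$$|J_z = \hbar\rangle = \begin{pmatrix} a \\ b \end{pmatrix} \ , \ \text{ where } |a|^2 + |b|^2 = 1 \tag{7.63}$$

to get from (7.62)

$$\hbar\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \hbar\begin{pmatrix} -ib \\ ia \end{pmatrix} = \hbar\begin{pmatrix} a \\ b \end{pmatrix} \tag{7.64}$$

This gives the result $ia = b$, which together with the normalization condition says
that $a = 1/\sqrt{2}$. We have arbitrarily chosen $a$ to be real since only the relative
phase between components is important. This then gives $b = i/\sqrt{2}$. Finally, we
have the eigenvector

$$|J_z = \hbar\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix} = |R\rangle \tag{7.65}$$

Similarly, we get

$$|J_z = -\hbar\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix} = |L\rangle \tag{7.66}$$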
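So the eigenvectors of $\hat{J}_z$ and hence of $\hat{R}(\theta)$ are the RCP and LCP basis states.
We then have

$$
\begin{aligned}
\hat{R}(\theta)|R\rangle &= \left(\cos\theta\,\hat{I} + \frac{i}{\hbar}\sin\theta\,\hat{J}_z\right)|R\rangle \\
&= (\cos\theta + i\sin\theta)|R\rangle \\
&= e^{i\theta}|R\rangle
\end{aligned}
\tag{7.67}
$$

Similarly,

$$\hat{R}(\theta)|L\rangle = e^{-i\theta}|L\rangle \tag{7.68}$$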

This agrees with our earlier discussion in Chapter 4 where we found that the 
eigenvalues of a unitary operator have complex exponential form. 

Physically, this says that $|R\rangle$ and $|L\rangle$ are only changed by an overall phase
factor under a rotation of the basis. This allows us to easily specify what happens
to an arbitrary vector $|\psi\rangle$ under rotations.

The procedure we will use next is, as we will see over and over again, the
standard way to do things in quantum mechanics. First, we expand the arbitrary
vector in the $\{|R\rangle, |L\rangle\}$ basis

$$|\psi\rangle = |R\rangle\langle R|\psi\rangle + |L\rangle\langle L|\psi\rangle \tag{7.69}$$

We then apply the rotation operator to obtain

$$
\begin{aligned}
\hat{R}(\theta)|\psi\rangle &= \hat{R}(\theta)|R\rangle\langle R|\psi\rangle + \hat{R}(\theta)|L\rangle\langle L|\psi\rangle \\
&= e^{i\theta}|R\rangle\langle R|\psi\rangle + e^{-i\theta}|L\rangle\langle L|\psi\rangle
\end{aligned}
\tag{7.70}
$$

or the RCP component of the vector is multiplied by the phase factor $e^{i\theta}$ and
the LCP component of the vector is multiplied by a different phase factor $e^{-i\theta}$.
Thus, rotations change the relative phase of the components, which is a real
physical change (as opposed to an overall phase change of the state vector).
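
A short numerical check of these phase factors may help. The sketch below is plain Python with helper names invented for the illustration and an arbitrary angle of 0.3 rad. It applies the rotation matrix to the circular states and then to an x-polarized photon, for which the relative phase between the RCP and LCP components changes from 0 to $2\theta$.

```python
import cmath
import math

S = 1 / math.sqrt(2)
R_KET = (S + 0j, S * 1j)     # |R> = (1, i)/sqrt(2)
L_KET = (S + 0j, -S * 1j)    # |L> = (1, -i)/sqrt(2)
X_KET = (1 + 0j, 0j)


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))


def inner(phi, psi):
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


theta = 0.3  # arbitrary rotation angle in radians

# Eigenvalue check: <R|R(theta)|R> and <L|R(theta)|L> are pure phase factors.
phase_R = inner(R_KET, apply(rotation(theta), R_KET))   # exp(+i theta)
phase_L = inner(L_KET, apply(rotation(theta), L_KET))   # exp(-i theta)

# Relative phase between the R and L components of an x-polarized photon.
rotated_X = apply(rotation(theta), X_KET)
relative_phase_before = cmath.phase(inner(R_KET, X_KET) / inner(L_KET, X_KET))
relative_phase_after = cmath.phase(inner(R_KET, rotated_X) / inner(L_KET, rotated_X))
```

The magnitudes of the RCP and LCP components are untouched by the rotation; only their relative phase moves.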

We interpret (reasons will be clear later) these results to say that the RCP
photon is in a state which is an eigenvector of $\hat{J}_z$ with eigenvalue $+\hbar$ or
that the photon in that state has z-component of spin = $+\hbar$. Similarly, a LCP
photon has z-component of spin = $-\hbar$.

Now, it is an experimental fact that if a photon traveling in the z-direction is
absorbed by matter, then the z-component of the angular momentum of the
absorber increases by $\hbar$ or decreases by $\hbar$. It never remains the same, nor does
it change by any value other than $\pm\hbar$.

One cannot predict, for any single photon, whether the change will be $+\hbar$ or $-\hbar$.
We can, however, predict the probability of either value occurring. In particular,
according to our probability formalism, we must have

$$|\langle R|\psi\rangle|^2 = \text{probability of } +\hbar \ , \quad |\langle L|\psi\rangle|^2 = \text{probability of } -\hbar \tag{7.71}$$

and the average value of the z-component of the angular momentum is

$$\langle\hat{J}_z\rangle = \sum_{\text{all possibilities}} (\text{eigenvalue}) \times (\text{probability of the eigenvalue}) \tag{7.72}$$

or

$$\langle\hat{J}_z\rangle = +\hbar|\langle R|\psi\rangle|^2 - \hbar|\langle L|\psi\rangle|^2 \tag{7.73}$$

In general, a photon is neither pure RCP nor pure LCP and the angular
momentum does not have a definite value.


We can still talk in terms of probabilities, however. The discreteness of the 
angular momentum spectrum forces a probabilistic interpretation on us. 

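We can easily see how all of this works using our mathematical formalism for
average values as follows:

$$
\begin{aligned}
\langle\hat{J}_z\rangle &= \langle\psi|\hat{J}_z|\psi\rangle \\
&= \left(\langle\psi|R\rangle\langle R| + \langle\psi|L\rangle\langle L|\right)\hat{J}_z\left(|R\rangle\langle R|\psi\rangle + |L\rangle\langle L|\psi\rangle\right) \\
&= \left(\langle R|\psi\rangle^*\langle R| + \langle L|\psi\rangle^*\langle L|\right)\hat{J}_z\left(|R\rangle\langle R|\psi\rangle + |L\rangle\langle L|\psi\rangle\right) \\
&= \langle R|\hat{J}_z|R\rangle|\langle R|\psi\rangle|^2 + \langle L|\hat{J}_z|L\rangle|\langle L|\psi\rangle|^2 \\
&\quad + \langle R|\hat{J}_z|L\rangle\langle R|\psi\rangle^*\langle L|\psi\rangle + \langle L|\hat{J}_z|R\rangle\langle L|\psi\rangle^*\langle R|\psi\rangle \\
&= +\hbar|\langle R|\psi\rangle|^2 - \hbar|\langle L|\psi\rangle|^2
\end{aligned}
\tag{7.74}
$$

as we showed earlier (7.73).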
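
The equality of the two ways of computing the average can be illustrated numerically. In the plain-Python sketch below, $\hat{J}_z$ is written in units of $\hbar$ and the state `psi = (0.6, 0.8i)` is a hypothetical elliptically polarized state chosen only because it is normalized and gives simple numbers.

```python
import math

S = 1 / math.sqrt(2)
R = (S + 0j, S * 1j)
L = (S + 0j, -S * 1j)
JZ = [[0j, -1j], [1j, 0j]]   # J_z in units of hbar, {|x>, |y>} basis


def inner(phi, psi):
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))


def expectation(op, psi):
    """<psi|op|psi> for a normalized state psi."""
    return inner(psi, apply(op, psi))


psi = (0.6 + 0j, 0.8j)   # hypothetical normalized state: 0.36 + 0.64 = 1

prob_plus = abs(inner(R, psi)) ** 2    # probability of +hbar
prob_minus = abs(inner(L, psi)) ** 2   # probability of -hbar

average_from_probabilities = (+1) * prob_plus + (-1) * prob_minus
average_from_operator = expectation(JZ, psi).real
```

For this state the probabilities are 0.98 and 0.02, and both routes give an average of 0.96 in units of $\hbar$.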


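Let us return for a moment to the matrix representation of the $\hat{J}_z$ operator. We
have found the following results:

$$\hat{J}_z|R\rangle = +\hbar|R\rangle \ \text{ and } \ \hat{J}_z|L\rangle = -\hbar|L\rangle \tag{7.75}$$

In the $\{|R\rangle, |L\rangle\}$ basis, these relations imply the matrix representation

$$\hat{J}_z = \begin{pmatrix} \langle R|\hat{J}_z|R\rangle & \langle R|\hat{J}_z|L\rangle \\ \langle L|\hat{J}_z|R\rangle & \langle L|\hat{J}_z|L\rangle \end{pmatrix} = \hbar\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{7.76}$$

which is the standard form of $\hat{J}_z$ in terms of one of the so-called Pauli matrices,
namely,

$$\hat{\sigma}_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \ \rightarrow \ \hat{J}_z = \hbar\hat{\sigma}_z \tag{7.77}$$

Now

$$|x\rangle = \frac{1}{\sqrt{2}}\left(|R\rangle + |L\rangle\right) \ \text{ and } \ |y\rangle = -\frac{i}{\sqrt{2}}\left(|R\rangle - |L\rangle\right) \tag{7.78}$$

and, therefore, in the $\{|x\rangle, |y\rangle\}$ basis we have the matrix representation

$$\hat{J}_z = \begin{pmatrix} \langle x|\hat{J}_z|x\rangle & \langle x|\hat{J}_z|y\rangle \\ \langle y|\hat{J}_z|x\rangle & \langle y|\hat{J}_z|y\rangle \end{pmatrix} = \hbar\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{7.79}$$

which is the form we guessed and used earlier.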
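
The change of representation can be verified by computing matrix elements directly. The plain-Python sketch below starts from $\hat{J}_z$ in the x-y basis (in units of $\hbar$) and evaluates $\langle a|\hat{J}_z|b\rangle$ for the circular basis states. The function name `matrix_elements` is an assumption introduced for the example.

```python
import math

S = 1 / math.sqrt(2)
X = (1 + 0j, 0j)
Y = (0j, 1 + 0j)
R = (S + 0j, S * 1j)
L = (S + 0j, -S * 1j)
JZ_XY = [[0j, -1j], [1j, 0j]]   # J_z / hbar written in the {|x>, |y>} basis


def inner(phi, psi):
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))


def matrix_elements(op, basis):
    """Matrix of <a|op|b> with rows a and columns b taken from `basis`."""
    return [[inner(a, apply(op, b)) for b in basis] for a in basis]


jz_in_rl = matrix_elements(JZ_XY, (R, L))   # expected: diag(1, -1)
jz_in_xy = matrix_elements(JZ_XY, (X, Y))   # expected: the original matrix
```

In the circular basis the operator is diagonal with entries +1 and -1 (times $\hbar$), which is the Pauli-matrix form quoted above.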
7.2.2. Projection Operators

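Let us now turn our attention to projection operators and density operators in
the context of photon polarization.

The general operator $|\psi\rangle\langle\phi|$ can be represented by a 2 x 2 matrix in the
polarization state vector space. It is constructed using the outer product rule:

$$\hat{P} = |\psi\rangle\langle\phi| = \begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix}\begin{pmatrix} \phi_x^* & \phi_y^* \end{pmatrix} = \begin{pmatrix} \psi_x\phi_x^* & \psi_x\phi_y^* \\ \psi_y\phi_x^* & \psi_y\phi_y^* \end{pmatrix} \tag{7.80}$$

or equivalently, by choosing a basis and finding the matrix representation

$$
\begin{aligned}
\hat{P} &= \begin{pmatrix} \langle x|\hat{P}|x\rangle & \langle x|\hat{P}|y\rangle \\ \langle y|\hat{P}|x\rangle & \langle y|\hat{P}|y\rangle \end{pmatrix} = \begin{pmatrix} \langle x|\psi\rangle\langle\phi|x\rangle & \langle x|\psi\rangle\langle\phi|y\rangle \\ \langle y|\psi\rangle\langle\phi|x\rangle & \langle y|\psi\rangle\langle\phi|y\rangle \end{pmatrix} \\
&= \begin{pmatrix} \psi_x\phi_x^* & \psi_x\phi_y^* \\ \psi_y\phi_x^* & \psi_y\phi_y^* \end{pmatrix}
\end{aligned}
\tag{7.81}
$$

In particular, we have for the projection operators

$$
\begin{aligned}
|x\rangle\langle x| &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ , & |x\rangle\langle y| &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \\
|y\rangle\langle x| &= \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \ , & |y\rangle\langle y| &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\end{aligned}
\tag{7.82}
$$

From these results we easily see that

$$|x\rangle\langle x| + |y\rangle\langle y| = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \hat{I}$$

and

$$
\begin{aligned}
|\psi\rangle = \hat{I}|\psi\rangle &= \left(|x\rangle\langle x| + |y\rangle\langle y|\right)|\psi\rangle \\
&= |x\rangle\langle x|\psi\rangle + |y\rangle\langle y|\psi\rangle = \begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix}
\end{aligned}
\tag{7.83}
$$

as they should. Similarly, we have

$$|R\rangle\langle R| + |L\rangle\langle L| = \hat{I} \tag{7.84}$$

which leads to

$$\hat{J}_z = \hat{J}_z\hat{I} = \hat{J}_z\left(|R\rangle\langle R| + |L\rangle\langle L|\right) \tag{7.85}$$

$$= \hbar|R\rangle\langle R| - \hbar|L\rangle\langle L| \tag{7.86}$$

which is the expansion of the operator $\hat{J}_z$ in terms of eigenvalues and
1-dimensional subspace projection operators (eigenvectors) that we discussed earlier.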
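
These projection-operator identities can be checked with a few lines of plain Python. The sketch below builds $|R\rangle\langle R|$ and $|L\rangle\langle L|$ with an `outer` helper (a name chosen for this example), confirms that they sum to the identity, and confirms that their difference reproduces $\hat{J}_z/\hbar$ in the x-y basis.

```python
import math

S = 1 / math.sqrt(2)
R = (S + 0j, S * 1j)
L = (S + 0j, -S * 1j)


def outer(ket, bra):
    """Matrix of |ket><bra|; the bra components are complex-conjugated."""
    return [[ket[i] * bra[j].conjugate() for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


P_R = outer(R, R)   # projector onto |R>
P_L = outer(L, L)   # projector onto |L>

completeness = add(P_R, P_L)      # |R><R| + |L><L| = I
jz_spectral = sub(P_R, P_L)       # (+1)|R><R| + (-1)|L><L| = J_z / hbar
P_R_squared = matmul(P_R, P_R)    # projectors satisfy P^2 = P
```

The difference of the two projectors comes out as the matrix with entries 0, -i, i, 0, which matches the x-y representation of $\hat{J}_z/\hbar$ used earlier in the section.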
7.2.3. Amplitudes and Probabilities

The probability interpretation we have been making follows from the concept
of superposition. The superposition idea says that we can write any arbitrary
photon state as a linear combination of basis states

$$|\psi\rangle = |R\rangle\langle R|\psi\rangle + |L\rangle\langle L|\psi\rangle \tag{7.87}$$

We then interpreted $|\langle R|\psi\rangle|^2$ as the probability that the photon in the state
$|\psi\rangle$ will behave as a RCP photon in the state $|R\rangle$.

Generalizing this statement, we say that a system in a state $|\psi\rangle$, in Quantum
Mechanics, has a probability $|\langle\phi|\psi\rangle|^2$ of behaving as if it were in the state
$|\phi\rangle$.

You might now conclude, from the experimental fact that only $\pm\hbar$ is transferred
to matter by light, that photons are always either in the state $|R\rangle$ with some
probability $\alpha$ or in the state $|L\rangle$ with probability $1 - \alpha$.

FACT: An x-polarized photon never passes through a y-polaroid.

PROBLEM: If the above interpretation of being either $|R\rangle$ or $|L\rangle$ were true,
then

1. an x-polarized photon has a probability $|\langle R|x\rangle|^2 = 1/2$ of being RCP and
a RCP photon has a probability $|\langle y|R\rangle|^2 = 1/2$ of being a y-polarized
photon and thus passing through a y-polaroid.

2. an x-polarized photon has a probability $|\langle L|x\rangle|^2 = 1/2$ of being LCP and
a LCP photon has a probability $|\langle y|L\rangle|^2 = 1/2$ of being a y-polarized
photon and thus passing through a y-polaroid.

This means that if we assume that the photon is either $|R\rangle$ or $|L\rangle$
but we do not know which, i.e., photon properties have an objective reality, then
the total probability that an x-polarized photon would get through a y-polaroid
in this interpretation is

$$\text{total probability} = |\langle R|x\rangle|^2\,|\langle y|R\rangle|^2 + |\langle L|x\rangle|^2\,|\langle y|L\rangle|^2 = \frac{1}{2} \tag{7.88}$$

However, as we stated, it NEVER HAPPENS. What is wrong?

SOLUTION: When we think of an x-polarized photon as being a RCP photon
or a LCP photon with equal probability, we are ruling out the possibility of any
interference between the RCP and LCP amplitudes. We are thinking classically!

We give meaning to the word interference here in this way.

The correct calculation of the probability, which lays the foundation for all of
the amplitude mechanics rules in Quantum Mechanics, goes as follows:

1. The probability amplitude of an x-polarized photon passing through a
y-polaroid $= \langle y|x\rangle = 0$, which implies that the probability $|\langle y|x\rangle|^2 = 0$
also.

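2. If we say that the x-polarized photon is in a superposition of $|R\rangle$ and $|L\rangle$
(we make no statement about probabilities at this point), this implies that

$$|x\rangle = |R\rangle\langle R|x\rangle + |L\rangle\langle L|x\rangle \tag{7.89}$$

which gives

$$\langle y|x\rangle = \langle y|R\rangle\langle R|x\rangle + \langle y|L\rangle\langle L|x\rangle \tag{7.90}$$

or the amplitude for an x-polarized photon to pass through a y-polaroid is
the sum of two amplitudes, namely, that it passes through as a RCP photon,
$\langle y|R\rangle\langle R|x\rangle$, and that it passes through as a LCP photon, $\langle y|L\rangle\langle L|x\rangle$.

3. The probability of passing through is then the absolute square of the total
amplitude

$$\begin{aligned}
\text{probability} &= \left|\langle y|R\rangle\langle R|x\rangle + \langle y|L\rangle\langle L|x\rangle\right|^2 \\
&= \left(\langle y|R\rangle^*\langle R|x\rangle^* + \langle y|L\rangle^*\langle L|x\rangle^*\right)\left(\langle y|R\rangle\langle R|x\rangle + \langle y|L\rangle\langle L|x\rangle\right) \\
&= |\langle R|x\rangle|^2\,|\langle y|R\rangle|^2 + |\langle L|x\rangle|^2\,|\langle y|L\rangle|^2 \\
&\quad + \langle y|R\rangle\langle R|x\rangle\,\langle y|L\rangle^*\langle L|x\rangle^* + \langle y|R\rangle^*\langle R|x\rangle^*\,\langle y|L\rangle\langle L|x\rangle
\end{aligned}$$

4. The first two terms are the same as the incorrect calculation (7.88) above.
The last two terms represent interference effects between the two amplitudes
(RCP way and LCP way).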

A simple calculation shows that the interference terms exactly cancel the first 
two terms and that the probability equals zero in agreement with experiment! 
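
The cancellation can be checked numerically. The following is a minimal sketch, assuming the convention $|R\rangle = (|x\rangle + i|y\rangle)/\sqrt{2}$ and $|L\rangle = (|x\rangle - i|y\rangle)/\sqrt{2}$ (an assumption on our part; this section does not restate the convention). The variable names are ours and purely illustrative.

```python
import math

s = 1 / math.sqrt(2)

# Assumed convention (not restated in this section):
# |R> = (|x> + i|y>)/sqrt(2),  |L> = (|x> - i|y>)/sqrt(2)
amp_R_x = complex(s, 0.0)    # <R|x>
amp_L_x = complex(s, 0.0)    # <L|x>
amp_y_R = complex(0.0, s)    # <y|R>
amp_y_L = complex(0.0, -s)   # <y|L>

# Amplitude for each indistinguishable way: product of successive amplitudes
way_R = amp_y_R * amp_R_x
way_L = amp_y_L * amp_L_x

# Incorrect rule (7.88): add the probabilities of the two ways
classical = abs(way_R) ** 2 + abs(way_L) ** 2

# Interference terms: the last line of the expansion in item 3
interference = 2 * (way_R * way_L.conjugate()).real

# Correct rule: add the amplitudes, then take the absolute square
quantum = abs(way_R + way_L) ** 2

print(classical, interference, quantum)
```

Under this convention the two "probability" terms add up to 1/2, the interference terms contribute -1/2, and the total is zero, as the experiment demands.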

INTERPRETATION: The way to interpret this result is as follows:

$\langle y|R\rangle\langle R|x\rangle$ = probability amplitude for an x-polarized photon to
pass through a y-polaroid as a RCP photon

$\langle y|L\rangle\langle L|x\rangle$ = probability amplitude for an x-polarized photon to
pass through a y-polaroid as a LCP photon

These are indistinguishable ways for the process to occur, i.e., no measurement
exists that can tell us whether it passes through the system as a RCP photon
or as a LCP photon without destroying the interference, i.e., without radically
altering the experiment.

To get the correct total probability, we add all the amplitudes for indistinguishable
ways and then take the absolute square of the resulting total amplitude.

In the incorrect calculation, we found the probability for each indistinguishable
way and then added the probabilities.


In one case, we eliminated the interference effects and got the wrong result and, 
in the other case, we included the interference effects and obtained the correct 
result. 

Summarizing, we have these rules for amplitude mechanics and probabilities in
Quantum Mechanics:

1. The probability amplitude for two successive events is the product of the
amplitudes for each event, i.e., the amplitude for the x-polarized photon
to pass through the y-polaroid as a RCP photon is the product
of the amplitude for an x-polarized photon to be a RCP photon, $\langle R|x\rangle$,
and the amplitude for a RCP photon to be a y-polarized photon, $\langle y|R\rangle$:

$$\langle R|x\rangle\langle y|R\rangle$$

2. The total amplitude for a process that can take place in several indistinguishable
ways is the sum of the amplitudes for each individual way, i.e.,

$$\langle y|x\rangle = \langle R|x\rangle\langle y|R\rangle + \langle L|x\rangle\langle y|L\rangle$$

We note here that this is merely a reflection of the property of projection
operators that

$$\hat{I} = |R\rangle\langle R| + |L\rangle\langle L|$$

which says that

$$\langle y|x\rangle = \langle y|\hat{I}|x\rangle = \langle R|x\rangle\langle y|R\rangle + \langle L|x\rangle\langle y|L\rangle$$

Thus, the mathematical sum over all projection operators being
equal to the identity operator is physically equivalent to the
sum over all possible intermediate states, and it turns into a
sum over all the amplitudes for indistinguishable ways in this
interpretation.

3. The total probability for the process to occur is the absolute square of the
total amplitude.

So, in classical physics, we

1. find amplitudes and probabilities of each way separately

2. add all probabilities to get total probability

We get NO interference effects!!

In Quantum Mechanics, we

1. find the amplitudes for each indistinguishable way the process can occur

2. add all the amplitudes to get a total amplitude

3. take the absolute square of the total amplitude to get the total probability

We get interference effects!!

The important result here is that we must consider ALL INDISTINGUISHABLE
WAYS in step (2).

An indistinguishable way is characterized as follows: 

1. If two ways are indistinguishable, then there exists no measurement that 
can decide which of the two ways actually happened without altering the 
experiment. 

2. In particular, if we attempt to find out, then the interference effects will 
disappear and we will return to the classical result obtained by adding 
probabilities. 

What actually happens is that during any measurement trying to distinguish the
ways, the relative phase of the components in the superposition becomes completely
uncertain and this will wash out the interference. This happens as follows:
instead of

$$|x\rangle = |R\rangle\langle R|x\rangle + |L\rangle\langle L|x\rangle \tag{7.91}$$

we would have, if we attempted to add a measurement to determine if the
x-polarized photon was RCP or LCP, to consider the state

$$|\tilde{x}\rangle = e^{i\alpha_R}\,|R\rangle\langle R|x\rangle + e^{i\alpha_L}\,|L\rangle\langle L|x\rangle \tag{7.92}$$

A probability calculation then gives

$$\begin{aligned}
\text{total probability} &= |\langle y|R\rangle\langle R|x\rangle|^2 + |\langle y|L\rangle\langle L|x\rangle|^2 \\
&\quad + 2\,\mathrm{Re}\left[\langle y|R\rangle\langle R|x\rangle\, e^{i(\alpha_R-\alpha_L)}\,\langle y|L\rangle^*\langle L|x\rangle^*\right]
\end{aligned} \tag{7.93}$$

The observed probability, which is the result of many identical measurements
in the laboratory (in the standard interpretation), is an average over all values
of the extra phases (they are random in different experiments).

This involves integrating over the relative phase, i.e.,

$$\frac{1}{2\pi}\int_0^{2\pi} e^{i(\alpha_R-\alpha_L)}\,d(\alpha_R-\alpha_L) = 0 \tag{7.94}$$

It is clear that the interference term averages to zero and we get the classical
result! This means we cannot add such a measurement and retain any quantum
results!
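
The averaging argument can be illustrated with a short numerical sketch. It assumes the same convention as before, $\langle y|R\rangle = i/\sqrt{2}$, $\langle y|L\rangle = -i/\sqrt{2}$, $\langle R|x\rangle = \langle L|x\rangle = 1/\sqrt{2}$ (our assumption), and replaces the integral over the relative phase by an average over a uniform grid. The function name `probability` and the grid size are illustrative choices only.

```python
import cmath
import math

s = 1 / math.sqrt(2)
way_R = complex(0.0, s) * s     # <y|R><R|x> under the assumed convention
way_L = complex(0.0, -s) * s    # <y|L><L|x>

def probability(delta):
    """|e^{i delta} <y|R><R|x> + <y|L><L|x>|^2 for delta = alpha_R - alpha_L."""
    return abs(cmath.exp(1j * delta) * way_R + way_L) ** 2

# Average over a uniform grid of relative phases in [0, 2*pi)
n = 3600
average = sum(probability(2 * math.pi * k / n) for k in range(n)) / n

print(probability(0.0), average)
```

With a definite relative phase of zero the probability vanishes (full interference); averaged over random relative phases it returns to the classical value 1/2.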
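7.2.4. Pure States, Unpure States and Density Operators

If the photon were in the state $|x\rangle$, then we would have, for some linear operator
$\hat{A}$,

$$\langle\hat{A}\rangle = \langle x|\hat{A}|x\rangle \tag{7.95}$$

From our earlier discussion of density operators, however, we must also have

$$\langle\hat{A}\rangle = \mathrm{Tr}(\hat{W}\hat{A}) \tag{7.96}$$

where $\hat{W}$ is a density operator. Therefore we must have (using the $\{|x\rangle, |y\rangle\}$
basis)

$$\begin{aligned}
\langle\hat{A}\rangle = \langle x|\hat{A}|x\rangle = \mathrm{Tr}(\hat{W}\hat{A}) &= \langle x|\hat{W}\hat{A}|x\rangle + \langle y|\hat{W}\hat{A}|y\rangle \\
&= \langle x|\hat{W}\hat{I}\hat{A}|x\rangle + \langle y|\hat{W}\hat{I}\hat{A}|y\rangle \\
&= \langle x|\hat{W}|x\rangle\langle x|\hat{A}|x\rangle + \langle x|\hat{W}|y\rangle\langle y|\hat{A}|x\rangle \\
&\quad + \langle y|\hat{W}|x\rangle\langle x|\hat{A}|y\rangle + \langle y|\hat{W}|y\rangle\langle y|\hat{A}|y\rangle
\end{aligned}$$

This implies that

$$\langle x|\hat{W}|x\rangle = 1 \quad\text{and}\quad \langle y|\hat{W}|x\rangle = \langle x|\hat{W}|y\rangle = \langle y|\hat{W}|y\rangle = 0 \tag{7.97}$$

or

$$\hat{W} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = |x\rangle\langle x| \tag{7.98}$$

which says that $|x\rangle$ is a pure state.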


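Now suppose that the photon is in the state

$$|\psi\rangle = \frac{1}{\sqrt{2}}|x\rangle + \frac{1}{\sqrt{2}}|y\rangle \tag{7.99}$$

This says that the probability = 1/2 that the photon behaves like $|x\rangle$ and the
probability = 1/2 that it behaves like $|y\rangle$. Note that the relative phase between
the components is assumed to be known exactly in this state. In this case, we
have

$$\begin{aligned}
\langle\hat{A}\rangle = \langle\psi|\hat{A}|\psi\rangle &= \frac{1}{2}\left[\langle x|\hat{A}|x\rangle + \langle x|\hat{A}|y\rangle + \langle y|\hat{A}|x\rangle + \langle y|\hat{A}|y\rangle\right] \\
&= \mathrm{Tr}(\hat{W}\hat{A}) = \langle x|\hat{W}\hat{A}|x\rangle + \langle y|\hat{W}\hat{A}|y\rangle = \langle x|\hat{W}\hat{I}\hat{A}|x\rangle + \langle y|\hat{W}\hat{I}\hat{A}|y\rangle \\
&= \langle x|\hat{W}|x\rangle\langle x|\hat{A}|x\rangle + \langle x|\hat{W}|y\rangle\langle y|\hat{A}|x\rangle \\
&\quad + \langle y|\hat{W}|x\rangle\langle x|\hat{A}|y\rangle + \langle y|\hat{W}|y\rangle\langle y|\hat{A}|y\rangle
\end{aligned}$$

which implies that

$$\langle x|\hat{W}|x\rangle = \frac{1}{2} = \langle y|\hat{W}|y\rangle = \langle y|\hat{W}|x\rangle = \langle x|\hat{W}|y\rangle \tag{7.100}$$

or

$$\hat{W} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \end{pmatrix} = |\psi\rangle\langle\psi| \tag{7.101}$$

So, again we have a pure state.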

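But what happens if we only know that the probability = 1/2 that the photon
behaves like $|x\rangle$ and the probability = 1/2 that it behaves like $|y\rangle$? This says that
we might write the state vector as

$$|\psi\rangle = a|x\rangle + b|y\rangle \tag{7.102}$$

where we only know that $|a|^2 = |b|^2 = 1/2$. Let us choose

$$a = \frac{e^{i\alpha_a}}{\sqrt{2}} \quad\text{and}\quad b = \frac{e^{i\alpha_b}}{\sqrt{2}} \tag{7.103}$$

We do not have any phase information in this case. In addition, the phase
values could be different in each separate experiment. This last fact means that
we must average over the relative phase (the only meaningful phase) $\alpha_a - \alpha_b$
when computing the probabilities and this means that all interference effects
will vanish.

When we calculate the expectation value we have

$$\begin{aligned}
\langle\hat{A}\rangle &= \langle\psi|\hat{A}|\psi\rangle \\
&= \frac{1}{2}\left[\langle x|\hat{A}|x\rangle + e^{-i(\alpha_a-\alpha_b)}\langle x|\hat{A}|y\rangle + e^{i(\alpha_a-\alpha_b)}\langle y|\hat{A}|x\rangle + \langle y|\hat{A}|y\rangle\right]
\end{aligned}$$

and when we average over the relative phase we obtain

$$\langle\hat{A}\rangle = \frac{1}{2}\langle x|\hat{A}|x\rangle + \frac{1}{2}\langle y|\hat{A}|y\rangle$$

Again, we must have

$$\begin{aligned}
\langle\hat{A}\rangle = \mathrm{Tr}(\hat{W}\hat{A}) &= \langle x|\hat{W}\hat{A}|x\rangle + \langle y|\hat{W}\hat{A}|y\rangle = \langle x|\hat{W}\hat{I}\hat{A}|x\rangle + \langle y|\hat{W}\hat{I}\hat{A}|y\rangle \\
&= \langle x|\hat{W}|x\rangle\langle x|\hat{A}|x\rangle + \langle x|\hat{W}|y\rangle\langle y|\hat{A}|x\rangle \\
&\quad + \langle y|\hat{W}|x\rangle\langle x|\hat{A}|y\rangle + \langle y|\hat{W}|y\rangle\langle y|\hat{A}|y\rangle
\end{aligned}$$

which implies that

$$\langle x|\hat{W}|x\rangle = \frac{1}{2} = \langle y|\hat{W}|y\rangle \quad\text{and}\quad \langle y|\hat{W}|x\rangle = \langle x|\hat{W}|y\rangle = 0$$

or

$$\begin{aligned}
\hat{W} = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} &= \frac{1}{2}|x\rangle\langle x| + \frac{1}{2}|y\rangle\langle y| \\
&= \mathrm{Prob}(x)\,|x\rangle\langle x| + \mathrm{Prob}(y)\,|y\rangle\langle y|
\end{aligned}$$

This is a nonpure or mixed state.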

So, we have a pure state only if the relative phase information is known exactly. 

The way to describe a nonpure state is by the corresponding density matrix, 
which only requires knowing probabilities and not phases. It really does not 
have a state vector. 
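
The difference between the two density matrices found above can be made concrete with a small sketch. It uses the standard criterion that $\mathrm{Tr}(\hat{W}^2) = 1$ for a pure state and $\mathrm{Tr}(\hat{W}^2) < 1$ for a mixed state, and a hypothetical observable $\hat{A} = |x\rangle\langle y| + |y\rangle\langle x|$ chosen by us only because it is sensitive to the off-diagonal (phase) elements. The helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

# Density matrices in the {|x>, |y>} basis
W_pure = [[0.5, 0.5], [0.5, 0.5]]    # (7.101): relative phase known exactly
W_mixed = [[0.5, 0.0], [0.0, 0.5]]   # phase-averaged state: only probabilities known

# Purity Tr(W^2): 1 for a pure state, less than 1 for a mixed state
purity_pure = trace(matmul(W_pure, W_pure))
purity_mixed = trace(matmul(W_mixed, W_mixed))

# Hypothetical observable with only off-diagonal elements: A = |x><y| + |y><x|
A = [[0.0, 1.0], [1.0, 0.0]]
exp_pure = trace(matmul(W_pure, A))     # <A> = Tr(W A) in the pure state
exp_mixed = trace(matmul(W_mixed, A))   # <A> = Tr(W A) in the mixed state

print(purity_pure, purity_mixed, exp_pure, exp_mixed)
```

Both states give probability 1/2 for $|x\rangle$ and for $|y\rangle$, yet the expectation value of this observable differs: the interference contribution survives only in the pure state.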


7.2.5. Unpolarized Light 

Consider the following experiment. We have a beam of monochromatic light
that is composed of photons from two sources which output photons in the
states $|\psi_1\rangle$ or $|\psi_2\rangle$, respectively. The sources emit the photons randomly and
are independent of each other, which implies that we cannot tell which source
any particular photon comes from.

We assign these probabilities

$p_1$ = probability that a photon comes from source #1
$p_2$ = probability that a photon comes from source #2

where $p_1 + p_2 = 1$. Now the probability that a particular observed photon transfers
$+\hbar$ is

$$p_+ = p_1\,|\langle R|\psi_1\rangle|^2 + p_2\,|\langle R|\psi_2\rangle|^2 \tag{7.104}$$

and the probability that it transfers $-\hbar$ is

$$p_- = p_1\,|\langle L|\psi_1\rangle|^2 + p_2\,|\langle L|\psi_2\rangle|^2 \tag{7.105}$$

This implies that the average value of the angular momentum transfer for the
beam of photons is

$$\begin{aligned}
\langle\hat{J}_z\rangle &= \hbar p_+ - \hbar p_- \\
&= \hbar p_1|\langle R|\psi_1\rangle|^2 + \hbar p_2|\langle R|\psi_2\rangle|^2 - \hbar p_1|\langle L|\psi_1\rangle|^2 - \hbar p_2|\langle L|\psi_2\rangle|^2 \\
&= p_1\left[\hbar|\langle R|\psi_1\rangle|^2 - \hbar|\langle L|\psi_1\rangle|^2\right] + p_2\left[\hbar|\langle R|\psi_2\rangle|^2 - \hbar|\langle L|\psi_2\rangle|^2\right] \\
&= p_1\langle\hat{J}_z\rangle_1 + p_2\langle\hat{J}_z\rangle_2
\end{aligned}$$

or, the average value of the angular momentum transfer for the beam of photons
= sum over the average value in each beam weighted by the probability that a
photon comes from that beam.
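
As an illustration of (7.104) and (7.105), here is a minimal sketch for a hypothetical beam in which source #1 emits RCP photons with $p_1 = 0.3$ and source #2 emits x-polarized photons with $p_2 = 0.7$. These numbers, the units $\hbar = 1$, and the convention $|R\rangle = (|x\rangle + i|y\rangle)/\sqrt{2}$, $|L\rangle = (|x\rangle - i|y\rangle)/\sqrt{2}$ are all assumptions made for the example.

```python
import math

def inner(u, v):
    """<u|v> for 2-component vectors in the {|x>, |y>} basis."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

s = 1 / math.sqrt(2)
R = (complex(s, 0.0), complex(0.0, s))    # assumed convention for |R>
L = (complex(s, 0.0), complex(0.0, -s))   # assumed convention for |L>
x = (complex(1.0, 0.0), complex(0.0, 0.0))

def mean_jz(sources, hbar=1.0):
    """Average angular momentum transfer for a list of (probability, state)."""
    p_plus = sum(p * abs(inner(R, psi)) ** 2 for p, psi in sources)    # (7.104)
    p_minus = sum(p * abs(inner(L, psi)) ** 2 for p, psi in sources)   # (7.105)
    return hbar * p_plus - hbar * p_minus

beam = [(0.3, R), (0.7, x)]          # hypothetical two-source beam
jz = mean_jz(beam)

# Weighted average of the single-source values, p1 <Jz>_1 + p2 <Jz>_2
jz_weighted = 0.3 * mean_jz([(1.0, R)]) + 0.7 * mean_jz([(1.0, x)])

print(jz, jz_weighted)
```

The two numbers agree: probabilities, not amplitudes, are being added, so no interference term between the two sources appears.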

Let me emphasize (once again) at this point that it is important to realize that
the statement

The photon is either in the state $|\psi_1\rangle$ or $|\psi_2\rangle$ but we do not know which

is NOT the same statement as

The photon is in a state which is a superposition of $|\psi_1\rangle$ and $|\psi_2\rangle$

In the second case, we are saying the relative phase is known, as in the state

$$|\psi\rangle = \frac{1}{\sqrt{2}}|x\rangle + \frac{1}{\sqrt{2}}|y\rangle \tag{7.106}$$

which we found to be a pure state. Being in a superposition implies that we
know the relative phase of the components.

In the first case, however, we are saying that the relative phase is unknown and,
as we have seen, interference effects will vanish. We can only specify a density
matrix in this case.

In pure states, we have superpositions and the probability amplitude rules apply.
In nonpure or mixed states, where the system is in one of several states
with definite probabilities, we find weighted averages (weighted with the state
probabilities) of the value in each state. We use addition of probabilities with
no interference effects, which, as we have seen, is equivalent to saying the relative
phase is unknown.

Unpolarized light has equal probability of being in any polarization state. It is 
just a special nonpure or mixed state. No relative phase information is known 
for unpolarized light. 


7.2.6. How Does the Polarization State Vector Change? 

Up to now we have been considering devices such as polaroids and prisms, which 
are go-nogo devices. Some photons get through and some do not for each of 
these devices depending on their polarization state. 

We now consider devices where all the photons get through no matter what 
their polarization state is, but, during transit, the device changes the incident 
polarization state in some way. 

In particular, we consider the example of a birefringent crystal, such as calcite.
A calcite crystal has a preferred direction called the optic axis. The crystal has
a different index of refraction for light polarized parallel to the optic axis than it
has for light polarized perpendicular to the optic axis. We assume that the optic
axis is in the x-y plane and send a beam of photons in the z-direction. Photons
polarized perpendicular to the optic axis are called ordinary and are in the state
$|o\rangle$ and photons polarized parallel to the optic axis are called extraordinary and
are in the state $|e\rangle$.

The set $\{|o\rangle, |e\rangle\}$ forms an orthonormal basis and general photon states interacting
with a calcite crystal are written as superpositions of these basis states.

This is an example of a general rule in quantum mechanics.

If we are doing an experiment using a particular measuring device that measures
the observable $\hat{Q}$, then we should use as the basis for all states the eigenvectors
of $\hat{Q}$. As we shall see, this requirement pushes us to ask the correct experimental
questions (those that quantum mechanics can answer). This particular basis is
called the home space for the experiment.

Now the phase of a light wave with wavelength $\lambda$ as it propagates through a
medium in the z-direction is given by the quantity

$$\varphi = e^{ikz} \tag{7.107}$$

with

$$k = \frac{2\pi}{\lambda} = \frac{n\omega}{c} \tag{7.108}$$

where n = index of refraction, $\omega = 2\pi f$, f = frequency and c = speed of
light.

Since the phase depends on the index of refraction, the effect of passing through
a calcite crystal is to change the relative phase of the $|o\rangle$ and $|e\rangle$ components
making up the superposition, which is a real physical change that is measurable.

We assume that the photon entering the calcite crystal is in the initial state

$$|\psi_{in}\rangle = |e\rangle\langle e|\psi_{in}\rangle + |o\rangle\langle o|\psi_{in}\rangle \tag{7.109}$$

The two components have different indices of refraction $n_e$ and $n_o$, respectively.

If the beam passes through a length $\ell$ then the state upon leaving is given by
(remember the component phases change differently)

$$|\psi_{out}\rangle = e^{ik_e\ell}\,|e\rangle\langle e|\psi_{in}\rangle + e^{ik_o\ell}\,|o\rangle\langle o|\psi_{in}\rangle = \hat{U}_\ell|\psi_{in}\rangle \tag{7.110}$$

where

$$\hat{U}_z = e^{ik_e z}\,|e\rangle\langle e| + e^{ik_o z}\,|o\rangle\langle o| \tag{7.111}$$

is a time development operator of some sort since the distance traveled in a time
t is proportional to t.

Now we define two new quantities which will be with us throughout our study
of Quantum Mechanics. For transitions between two states (in and out in this
case)

$\langle\phi|\psi_{out}\rangle = \langle\phi|\hat{U}_z|\psi_{in}\rangle$ = the transition amplitude for a photon to enter
the calcite in state $|\psi_{in}\rangle$ and leave in state $|\phi\rangle$

$|\langle\phi|\psi_{out}\rangle|^2 = |\langle\phi|\hat{U}_z|\psi_{in}\rangle|^2$ = the corresponding transition probability
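To proceed any further, we need to find out more about the operator $\hat{U}_z$. Now

$$|\psi_z\rangle = \text{state of photon after traveling a distance } z \text{ through calcite} = \hat{U}_z|\psi_{in}\rangle \tag{7.112}$$

From the form of $\hat{U}_z$ we have

$$\begin{aligned}
\hat{U}_{z+\epsilon} &= e^{ik_e(z+\epsilon)}\,|e\rangle\langle e| + e^{ik_o(z+\epsilon)}\,|o\rangle\langle o| \\
&= \left(e^{ik_e\epsilon}\,|e\rangle\langle e| + e^{ik_o\epsilon}\,|o\rangle\langle o|\right)\left(e^{ik_e z}\,|e\rangle\langle e| + e^{ik_o z}\,|o\rangle\langle o|\right)
\end{aligned} \tag{7.113}$$

or

$$\hat{U}_{z+\epsilon} = \hat{U}_\epsilon\hat{U}_z \tag{7.114}$$

This implies that

$$|\psi_{z+\epsilon}\rangle = \hat{U}_{z+\epsilon}|\psi_{in}\rangle = \hat{U}_\epsilon\hat{U}_z|\psi_{in}\rangle = \hat{U}_\epsilon|\psi_z\rangle \tag{7.115}$$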

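Now let $\epsilon \to 0$ such that $k_o\epsilon \ll 1$ and $k_e\epsilon \ll 1$ and we can then write (to 1st order)

$$\begin{aligned}
\hat{U}_\epsilon &= e^{ik_e\epsilon}\,|e\rangle\langle e| + e^{ik_o\epsilon}\,|o\rangle\langle o| \\
&= (1 + ik_e\epsilon)\,|e\rangle\langle e| + (1 + ik_o\epsilon)\,|o\rangle\langle o| \\
&= \hat{I} + i\epsilon\hat{K}
\end{aligned} \tag{7.116}$$

where

$$\hat{I} = |e\rangle\langle e| + |o\rangle\langle o| \quad\text{and}\quad \hat{K} = k_e\,|e\rangle\langle e| + k_o\,|o\rangle\langle o| \tag{7.117}$$

Now, the relation

$$\hat{K} = k_e\,|e\rangle\langle e| + k_o\,|o\rangle\langle o| \tag{7.118}$$

is an expansion of an operator in terms of its eigenvalues and the corresponding
projection operators (eigenvectors). It says that the eigenvectors of $\hat{K}$ are $|e\rangle$
and $|o\rangle$ with eigenvalues $k_e$ and $k_o$, respectively. This illustrates the awesome
power in these methods we have developed!!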

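We then have

$$|\psi_{z+\epsilon}\rangle = (\hat{I} + i\epsilon\hat{K})|\psi_z\rangle \tag{7.119}$$

$$|\psi_{z+\epsilon}\rangle - |\psi_z\rangle = i\epsilon\hat{K}|\psi_z\rangle \tag{7.120}$$

$$\lim_{\epsilon\to 0}\frac{|\psi_{z+\epsilon}\rangle - |\psi_z\rangle}{\epsilon} = i\hat{K}|\psi_z\rangle \tag{7.121}$$

which gives the differential equation for the time development of the state vector

$$\frac{d}{dz}|\psi_z\rangle = i\hat{K}|\psi_z\rangle \tag{7.122}$$

It is clearly similar to the differential equation we obtained earlier (Chapter 6)
from the time development operator. If we follow the results from the earlier
case, then we should have

$\hat{K}$ = Hermitian operator and $\hat{U}_z$ = unitary operator

Let us derive these two results. We have, using the x-y basis

$$\begin{aligned}
\langle x|\psi_{z+\epsilon}\rangle - \langle x|\psi_z\rangle &= i\epsilon\langle x|\hat{K}|\psi_z\rangle \\
&= i\epsilon\langle x|\hat{K}\hat{I}|\psi_z\rangle = i\epsilon\langle x|\hat{K}|x\rangle\langle x|\psi_z\rangle + i\epsilon\langle x|\hat{K}|y\rangle\langle y|\psi_z\rangle
\end{aligned} \tag{7.123}$$

or the change in the x-component of $|\psi_z\rangle$ as we move an infinitesimal amount $\epsilon$ has
one part proportional to the x-component of $|\psi_z\rangle$ and a second part proportional
to the y-component of $|\psi_z\rangle$.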

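Similarly, we have

$$\begin{aligned}
\langle y|\psi_{z+\epsilon}\rangle - \langle y|\psi_z\rangle &= i\epsilon\langle y|\hat{K}|\psi_z\rangle \\
&= i\epsilon\langle y|\hat{K}\hat{I}|\psi_z\rangle = i\epsilon\langle y|\hat{K}|x\rangle\langle x|\psi_z\rangle + i\epsilon\langle y|\hat{K}|y\rangle\langle y|\psi_z\rangle
\end{aligned} \tag{7.124}$$

Now, since no photons are lost as we pass through, we must have

$$\langle\psi_{z+\epsilon}|\psi_{z+\epsilon}\rangle = \langle\psi_z|\psi_z\rangle = 1 \tag{7.125}$$

for all z. To first order in $\epsilon$ we then get

$$\begin{aligned}
\langle\psi_{z+\epsilon}|\psi_{z+\epsilon}\rangle = \langle\psi_z|\psi_z\rangle &+ i\epsilon\left[\langle x|\hat{K}|x\rangle - \langle x|\hat{K}|x\rangle^*\right]|\langle x|\psi_z\rangle|^2 \\
&+ i\epsilon\left[\langle y|\hat{K}|y\rangle - \langle y|\hat{K}|y\rangle^*\right]|\langle y|\psi_z\rangle|^2 \\
&+ i\epsilon\left[\langle x|\hat{K}|y\rangle - \langle y|\hat{K}|x\rangle^*\right]\langle y|\psi_z\rangle\langle x|\psi_z\rangle^* \\
&+ i\epsilon\left[\langle y|\hat{K}|x\rangle - \langle x|\hat{K}|y\rangle^*\right]\langle x|\psi_z\rangle\langle y|\psi_z\rangle^*
\end{aligned}$$

which says that, since this must hold for every state $|\psi_z\rangle$, we must have

$$\langle x|\hat{K}|x\rangle = \langle x|\hat{K}|x\rangle^* \;,\quad \langle y|\hat{K}|y\rangle = \langle y|\hat{K}|y\rangle^*$$

$$\langle x|\hat{K}|y\rangle = \langle y|\hat{K}|x\rangle^* \;,\quad \langle y|\hat{K}|x\rangle = \langle x|\hat{K}|y\rangle^*$$

or that $\hat{K}$ is Hermitian.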

Finally, one can show that $\hat{U}_z^\dagger\hat{U}_z = \hat{I}$ so that $\hat{U}_z$ is unitary as expected. From
our earlier discussions in Chapter 6, we then identify

$\hat{U}_z$ = transformation operator and $\hat{K}$ = generator of transformation.
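
The three properties used above (composition, norm preservation, and the differential equation) can be checked numerically in the $\{|e\rangle, |o\rangle\}$ basis, where $\hat{U}_z$ is diagonal. The following is only a sketch: the wave numbers `k_e`, `k_o`, the input state, and the step `eps` are hypothetical values chosen for illustration.

```python
import cmath

k_e, k_o = 1.3, 0.8   # hypothetical wave numbers (arbitrary units)

def U(z):
    """Diagonal entries of U_z in the {|e>, |o>} basis, from (7.111)."""
    return (cmath.exp(1j * k_e * z), cmath.exp(1j * k_o * z))

def apply(u, psi):
    """Apply a diagonal operator to a state given as (e-component, o-component)."""
    return tuple(d * c for d, c in zip(u, psi))

psi_in = (complex(0.6, 0.0), complex(0.0, 0.8))   # hypothetical normalized input
z, eps = 2.0, 1e-6

# Composition (7.114): U_{z+eps} = U_eps U_z
lhs = apply(U(z + eps), psi_in)
rhs = apply(U(eps), apply(U(z), psi_in))

# Unitarity: the norm of the state is unchanged
psi_z = apply(U(z), psi_in)
norm_out = sum(abs(c) ** 2 for c in psi_z)

# Differential equation (7.122): finite difference versus i K |psi_z>
deriv = tuple((a - b) / eps for a, b in zip(lhs, psi_z))
iK_psi = (1j * k_e * psi_z[0], 1j * k_o * psi_z[1])

print(norm_out, deriv, iK_psi)
```

The finite-difference derivative agrees with $i\hat{K}|\psi_z\rangle$ up to terms of order `eps`, which is the content of the first-order expansion (7.116).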


7.2.7. Calculating the Transition Probability 

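We defined the transition probability as

$$T(z) = |\langle\phi|\psi_{z,out}\rangle|^2 = |\langle\phi|\hat{U}_z|\psi_{z,in}\rangle|^2 \tag{7.126}$$

Using

$$\hat{U}_z = e^{ik_e z}\,|e\rangle\langle e| + e^{ik_o z}\,|o\rangle\langle o|$$

and

$$|\psi_{in}\rangle = a|o\rangle + b|e\rangle \quad\text{where}\quad |a|^2 + |b|^2 = 1 \tag{7.127}$$

we get

$$\begin{aligned}
T(z) &= \left|\langle\phi|\left(e^{ik_e z}\,|e\rangle\langle e| + e^{ik_o z}\,|o\rangle\langle o|\right)\left(a|o\rangle + b|e\rangle\right)\right|^2 \\
&= \left|\langle\phi|\left(b\,e^{ik_e z}|e\rangle + a\,e^{ik_o z}|o\rangle\right)\right|^2 \\
&= \left|b\,e^{ik_e z}\langle\phi|e\rangle + a\,e^{ik_o z}\langle\phi|o\rangle\right|^2
\end{aligned} \tag{7.128}$$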
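Now let us ask a specific question.

Suppose $a = 1/\sqrt{2} = -ib$, which means that the photon entering the calcite
crystal is an LCP photon.

What is the probability that it will exit as a RCP photon? This means we
choose

$$|\phi\rangle = \frac{1}{\sqrt{2}}\left(|o\rangle - i|e\rangle\right) \tag{7.129}$$

or

$$\langle\phi|e\rangle = \frac{i}{\sqrt{2}} \quad\text{and}\quad \langle\phi|o\rangle = \frac{1}{\sqrt{2}} \tag{7.130}$$

We then get

$$\begin{aligned}
T(z) &= \left|b\,e^{ik_e z}\langle\phi|e\rangle + a\,e^{ik_o z}\langle\phi|o\rangle\right|^2 \\
&= \left|\frac{i}{\sqrt{2}}\,e^{ik_e z}\,\frac{i}{\sqrt{2}} + \frac{1}{\sqrt{2}}\,e^{ik_o z}\,\frac{1}{\sqrt{2}}\right|^2 = \frac{1}{4}\left|e^{ik_o z} - e^{ik_e z}\right|^2 \\
&= \frac{1}{4}\left(1 + 1 - e^{i(k_o-k_e)z} - e^{-i(k_o-k_e)z}\right) \\
&= \frac{1}{2}\left(1 - \cos\left[(k_o - k_e)z\right]\right)
\end{aligned} \tag{7.131}$$

If we choose $(k_o - k_e)z = \pi$, then T = 1 and all the LCP photons are turned into
RCP photons by a calcite crystal of just the right length.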
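
A direct numerical evaluation of (7.126) reproduces (7.131). In this sketch the states are stored as (o-component, e-component) pairs, with $b = i/\sqrt{2}$ as implied by $a = 1/\sqrt{2} = -ib$; the wave numbers `k_o` and `k_e` are hypothetical values chosen only for illustration.

```python
import cmath
import math

def transition_probability(phi, psi_in, k_e, k_o, z):
    """|<phi|U_z|psi_in>|^2, with states given as (o-component, e-component)."""
    out_o = cmath.exp(1j * k_o * z) * psi_in[0]
    out_e = cmath.exp(1j * k_e * z) * psi_in[1]
    amplitude = phi[0].conjugate() * out_o + phi[1].conjugate() * out_e
    return abs(amplitude) ** 2

s = 1 / math.sqrt(2)
psi_in = (complex(s, 0.0), complex(0.0, s))   # a = 1/sqrt(2), b = i/sqrt(2)
phi = (complex(s, 0.0), complex(0.0, -s))     # <phi|o> = 1/sqrt(2), <phi|e> = i/sqrt(2)

k_o, k_e = 2.0, 1.5                 # hypothetical wave numbers (arbitrary units)
z_full = math.pi / (k_o - k_e)      # length for which (k_o - k_e) z = pi

T_start = transition_probability(phi, psi_in, k_e, k_o, 0.0)
T_full = transition_probability(phi, psi_in, k_e, k_o, z_full)

def closed_form(z):
    """Right-hand side of (7.131)."""
    return 0.5 * (1 - math.cos((k_o - k_e) * z))

print(T_start, T_full)
```

At zero length nothing is converted, and at the length `z_full` the conversion is complete, in agreement with the closed form.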


This simple example clearly exhibits the power of these techniques. 


7.2.8. Some More Bayesian Thoughts 

The essence of quantum theory is its ability to predict probabilities for the outcomes
of tests based on specified preparations. Quantum mechanics is not a
theory about reality; it is a prescription for making the best possible predictions
about the future based on certain information (specified) about the past.
The quantum theorist can tell you what the odds are, if you wish to bet on the
occurrence of various events, such as the clicking of this or that detector.

However, a more common activity is the reverse situation where the outcomes
of tests are known and it is their preparation (the initial state) that has to be
guessed. The formal name for this process is retrodiction. Retrodicting is the
analog of forecasting an event, but directed oppositely in time, i.e., to the past
rather than the future. Just as one might forecast, from a knowledge of physical
laws along with specific data about the current position and speed of a comet,
where it will be ten years from now, one might retrodict where it was ten years
ago.

Suppose we have the experiment described in Figure 7.2 below:

Figure 7.2: Experimental Setup

where S is a thermal source of light, P is a polarizer, H is a pinhole, C is a
calcite crystal, and D is a detector with separate counters for the two different
polarized beams emerging from the calcite crystal. The detector D also makes
a permanent record of the measured events. We assume that the light intensity
is so weak and the detectors are so fast that individual photons can be registered.
The arrivals of photons are recorded by printing + or - on a paper tape,
according to whether the upper or lower detector was triggered, respectively.
The sequence of + and - marks appears random. As the total numbers of marks,
$N_+$ and $N_-$, become large, we find that the corresponding probabilities (count
ratios) tend to limits

$$\frac{N_+}{N_+ + N_-} \to \cos^2\alpha \;,\qquad \frac{N_-}{N_+ + N_-} \to \sin^2\alpha \tag{7.132}$$

where $\alpha$ is the angle between the polarization axis of the polaroid and the optic
axis of the calcite crystal.


Now suppose that we do an experiment and find that the two detectors recorded 
4 and 3 events, respectively. What can we infer about the orientation of the 
polarizer? 


This is the so-called inverse probability problem, which, as we have seen from our
earlier discussions in Chapter 5, is an ideal situation to use Bayesian methods.

Consider the following description.

Event B is the outcome of the experiment described above with 4+ detections
and 3- detections. This is a single experiment and not a set of seven experiments.

Event A is the positioning of the polarizer at an angle in the interval $\theta$ to $\theta + d\theta$,
in that experiment.

Now, in a statistical ensemble, that is, an infinite set of conceptual replicas of the
same system, the relative frequencies of events A and B define the probabilities
$\mathrm{Prob}(A|I) = \mathrm{Prob}(A)$ and $\mathrm{Prob}(B|I) = \mathrm{Prob}(B)$, where I is all the information
about the preparation (conditioning).

In addition, $\mathrm{Prob}(A \cap B|I)$ is the joint probability of events A and B. This is the
relative frequency of the occurrence of both events, in the statistical ensemble
under consideration. $\mathrm{Prob}(A|B \cap I)$ is the conditional probability of A, when B
is true. As in Chapter 5, we have the relations

$$\mathrm{Prob}(A \cap B|I) = \mathrm{Prob}(A|B \cap I)\,\mathrm{Prob}(B|I) = \mathrm{Prob}(B|A \cap I)\,\mathrm{Prob}(A|I) \tag{7.133}$$

and

$$\mathrm{Prob}(A|B \cap I) = \frac{\mathrm{Prob}(B|A \cap I)\,\mathrm{Prob}(A|I)}{\mathrm{Prob}(B|I)} \tag{7.134}$$

The last equation is Bayes' theorem.

In this equation it is assumed that $\mathrm{Prob}(B|A \cap I)$ is known from the appropriate
physical theory. For example, in the above experiment, the theory tells us that
the probabilities for triggering the upper and lower detectors are $\cos^2\theta$ and
$\sin^2\theta$. We therefore have, from the binomial distribution,

$$\begin{aligned}
\mathrm{Prob}(B = \{4,3\}|A \cap I) &= \frac{(n_+ + n_-)!}{n_+!\,n_-!}\,\mathrm{Prob}(+|A \cap I)^{n_+}\,\mathrm{Prob}(-|A \cap I)^{n_-} \\
&= \frac{7!}{4!\,3!}\,(\cos^2\theta)^4(\sin^2\theta)^3 \\
&= 35\cos^8\theta\,\sin^6\theta
\end{aligned} \tag{7.135}$$

In order to determine $\mathrm{Prob}(A|B \cap I)$ we still need $\mathrm{Prob}(A|I)$ and $\mathrm{Prob}(B|I)$.
These probabilities cannot be calculated from a theory nor determined empirically.
They depend solely on the statistical ensemble that we have mentally
constructed.


Let us consider the complete set of events of type A and call them $A_1, A_2, \ldots$,
etc. For example, $A_j$ represents the positioning of the polarizer at an angle
between $\theta_j$ and $\theta_j + d\theta_j$. By completeness,

$$\sum_j P(A_j) = 1 \tag{7.136}$$

and therefore

$$P(B) = \sum_j P(B|A_j)\,P(A_j) \tag{7.137}$$

At this point we introduce Bayes' postulate (different from Bayes' theorem).
This postulate, which we have used in earlier discussions, is also called the principle
of indifference or the principle of insufficient reasoning.

If we have no reason to expect that the person who positioned the polarizer had
a preference for some particular orientation, we assume that all orientations are
equally likely, so that

$$P(A) = \frac{d\theta}{\pi} \tag{7.138}$$

for every $\theta$ (we can always take $0 \le \theta \le \pi$ because $\theta$ and $\theta + \pi$ are equivalent).
We then have

$$P(B) = \sum_j P(B|A_j)\,P(A_j) = \frac{1}{\pi}\int_0^{\pi} 35\cos^8\theta\,\sin^6\theta\,d\theta = \frac{35\cdot 5}{2^{11}} \tag{7.139}$$

and we then obtain from Bayes' theorem that

$$\begin{aligned}
\mathrm{Prob}(A|B \cap I) &= \frac{\mathrm{Prob}(B|A \cap I)\,\mathrm{Prob}(A|I)}{\mathrm{Prob}(B|I)} \\
&= \frac{35\cos^8\theta\,\sin^6\theta\,\frac{d\theta}{\pi}}{\frac{35\cdot 5}{2^{11}}} = \frac{2^{11}}{5\pi}\cos^8\theta\,\sin^6\theta\,d\theta
\end{aligned} \tag{7.140}$$

which is the probability that the angle is between $\theta$ and $\theta + d\theta$ given that event
B is the outcome of a single experiment with 4+ detections and 3- detections.

Suppose $d\theta = 1^\circ = 0.0175$ rad. If we plot

$$\mathrm{Prob}(\theta|B = \{4,3\}, d\theta = 0.0175) = \frac{2^{11}}{5\pi}\cos^8\theta\,\sin^6\theta\,d\theta \approx 2.28\cos^8\theta\,\sin^6\theta \tag{7.141}$$

versus $\theta$ we have

Figure 7.3: Most Likely Angles


This says that, given the single data set, the angle is most likely to be

0.72 rad = 41.3° or 2.42 rad = 138.7° (7.142)

Clearly, Bayesian analysis allows us to infer results from one-time experiments
on single systems. The key is the use of Bayes' postulate of indifference.
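
The normalization (7.139) and the location of the posterior peak can be checked numerically. The sketch below uses a simple midpoint rule on a grid of our own choosing (the grid size `n` is an arbitrary illustrative value) and searches only the interval $0 < \theta < \pi/2$; the second peak follows from the symmetry $\theta \to \pi - \theta$.

```python
import math

def likelihood(theta):
    """Prob(B = {4,3} | theta) from the binomial distribution, (7.135)."""
    return 35 * math.cos(theta) ** 8 * math.sin(theta) ** 6

# Midpoint rule for P(B) = (1/pi) * integral over [0, pi], as in (7.139)
n = 20000
h = math.pi / n
grid = [(k + 0.5) * h for k in range(n)]
evidence = sum(likelihood(t) for t in grid) * h / math.pi

def posterior_density(theta):
    """Posterior per unit angle under a uniform prior 1/pi, from (7.140)."""
    return likelihood(theta) / (math.pi * evidence)

# Most likely angle in the first quadrant
theta_peak = max(grid[: n // 2], key=posterior_density)

print(evidence, theta_peak, math.degrees(theta_peak))
```

Setting the derivative of $\cos^8\theta\,\sin^6\theta$ to zero gives $\tan^2\theta = 3/4$, so the grid search should land near 0.71 rad, close to the value read off the plot in (7.142).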


7.3. The Strange World of Neutral K-Mesons 

We can now use the same formalism we developed for photon polarization to
study elementary particles called K-mesons.

K-mesons are produced in high-energy accelerators via the production process

$$\pi^- + p^+ \to \Lambda^0 + K^0$$

In this reaction, electric charge is conserved. This reaction takes place via the
so-called strong interactions. Another physical quantity called strangeness is
also conserved in strong interactions. All $K^0$-mesons have a strangeness equal
to +1.

For every particle there always exists an antiparticle. For the $K^0$, the antiparticle
is called the $\bar{K}^0$. The $\bar{K}^0$-mesons have a strangeness equal to -1. A reaction
involving the $\bar{K}^0$ is

$$\bar{K}^0 + p^+ \to \Lambda^0 + \pi^+$$

which is an absorption process.

The K-mesons that exist in the experimental world (the laboratory) are linear
superpositions of $K^0$ and $\bar{K}^0$ states in the same way that RCP and LCP photons
were superpositions of $|x\rangle$ and $|y\rangle$ polarization states. So the world of K-mesons
can be represented by a 2-dimensional vector space.

One basis for the vector space is the orthonormal set $\{|K^0\rangle, |\bar{K}^0\rangle\}$ where

$$\langle K^0|K^0\rangle = 1 = \langle\bar{K}^0|\bar{K}^0\rangle \quad\text{and}\quad \langle K^0|\bar{K}^0\rangle = 0 \tag{7.143}$$

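Two linear operators are important for the study of K-mesons.

First, we represent the strangeness operator. We already stated that the states
$|K^0\rangle$ and $|\bar{K}^0\rangle$ have definite values of strangeness, which means that they are
eigenvectors of the strangeness operator $\hat{S}$ with eigenvalues $\pm 1$ (by convention).
Using our formalism, this means that

$$\hat{S}|K^0\rangle = |K^0\rangle \quad\text{and}\quad \hat{S}|\bar{K}^0\rangle = -|\bar{K}^0\rangle \tag{7.144}$$

$$\begin{aligned}
\hat{S} &= \begin{pmatrix} \langle K^0|\hat{S}|K^0\rangle & \langle K^0|\hat{S}|\bar{K}^0\rangle \\ \langle\bar{K}^0|\hat{S}|K^0\rangle & \langle\bar{K}^0|\hat{S}|\bar{K}^0\rangle \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \\
&= |K^0\rangle\langle K^0| - |\bar{K}^0\rangle\langle\bar{K}^0|
\end{aligned} \tag{7.145}$$

in the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis.

The second linear operator that is important in the K-meson system is charge
conjugation $\hat{C}$. This operator changes particles into antiparticles and vice versa.
In the K-meson system using the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis we define $\hat{C}$ by the
particle-antiparticle changing relations

$$\hat{C}|K^0\rangle = |\bar{K}^0\rangle \quad\text{and}\quad \hat{C}|\bar{K}^0\rangle = |K^0\rangle \tag{7.146}$$

$$\hat{C} = \begin{pmatrix} \langle K^0|\hat{C}|K^0\rangle & \langle K^0|\hat{C}|\bar{K}^0\rangle \\ \langle\bar{K}^0|\hat{C}|K^0\rangle & \langle\bar{K}^0|\hat{C}|\bar{K}^0\rangle \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{7.147}$$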

We can find the eigenvectors and eigenvalues of the $\hat{C}$ operator as follows

$$\hat{C}|\psi\rangle = \lambda|\psi\rangle \tag{7.148}$$

$$\hat{C}^2|\psi\rangle = \lambda\hat{C}|\psi\rangle = \lambda^2|\psi\rangle = \hat{I}|\psi\rangle = |\psi\rangle$$

where we have used $\hat{C}^2 = \hat{I}$. This result says that $\lambda^2 = 1$ or the eigenvalues of $\hat{C}$
are $\pm 1$. If we use the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis and assume that

$$|\psi\rangle = a|K^0\rangle + b|\bar{K}^0\rangle \quad\text{where}\quad |a|^2 + |b|^2 = 1 \tag{7.149}$$

we find for $\lambda = +1$

$$\begin{aligned}
\hat{C}|\psi\rangle &= a\hat{C}|K^0\rangle + b\hat{C}|\bar{K}^0\rangle = a|\bar{K}^0\rangle + b|K^0\rangle \\
&= |\psi\rangle = a|K^0\rangle + b|\bar{K}^0\rangle
\end{aligned} \tag{7.150}$$

or $a = b = 1/\sqrt{2}$. If we define the +1 eigenvector as $|K_S\rangle$, we then have

$$|K_S\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar{K}^0\rangle\right) \tag{7.151}$$

Similarly, if we define the -1 eigenvector as $|K_L\rangle$, we then have

$$|K_L\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle - |\bar{K}^0\rangle\right) \tag{7.152}$$

where

$$\hat{C}|K_S\rangle = |K_S\rangle \quad\text{and}\quad \hat{C}|K_L\rangle = -|K_L\rangle \tag{7.153}$$

Since the commutator $[\hat{S}, \hat{C}] \neq 0$, these two operators do not have a common set
of eigenvectors. This means that both operators cannot have definite values in
the same state. In fact, the concept of charge conjugation is meaningless for
K-mesons in the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ states and the concept of strangeness is meaningless
for K-mesons in the $\{|K_S\rangle, |K_L\rangle\}$ states.

The $\{|K_S\rangle, |K_L\rangle\}$ states form a second orthonormal basis for the vector space
(like the RCP and LCP polarization states).
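
These statements are easy to verify with the 2 x 2 matrices (7.145) and (7.147). The following sketch uses plain nested lists in the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis; the helper names are ours.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(m, v):
    """Apply a 2x2 matrix to a 2-component vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

S = [[1, 0], [0, -1]]   # strangeness, (7.145)
C = [[0, 1], [1, 0]]    # charge conjugation, (7.147)

# Commutator [S, C] = SC - CS
SC, CS = matmul(S, C), matmul(C, S)
commutator = [[SC[i][j] - CS[i][j] for j in range(2)] for i in range(2)]

s = 1 / math.sqrt(2)
K_S = [s, s]     # (7.151)
K_L = [s, -s]    # (7.152)

C_squared = matmul(C, C)     # should be the identity
C_on_KS = matvec(C, K_S)     # should equal +K_S
C_on_KL = matvec(C, K_L)     # should equal -K_L
S_on_KS = matvec(S, K_S)     # equals K_L, so K_S has no definite strangeness

print(commutator, C_on_KS, C_on_KL, S_on_KS)
```

The commutator is nonzero, and acting with $\hat{S}$ on $|K_S\rangle$ produces $|K_L\rangle$ rather than a multiple of $|K_S\rangle$, which is the sense in which strangeness is not defined in the $\{|K_S\rangle, |K_L\rangle\}$ states.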

The standard approach we will follow when studying physical systems using 
quantum mechanics will be to 

1. define the Hamiltonian for the system 

2. find its eigenvalues and eigenvectors 

3. investigate the time development operator generated by the Hamiltonian 
and 

4. calculate transition probabilities connected to experiments 

Along the way we will define the properties of other operators appropriate to the 
system under investigation (like S and C above). It is the job of the theoretical 
physicist to derive or guess an appropriate Hamiltonian. 

Since we are in a 2-dimensional vector space, all operators are represented by 2 x 2
matrices. In the case of the K-meson system, we will assume the most general
form constructed from all of the relevant operators. Therefore, we assume

$$\hat{H} = M\hat{I} + A\hat{C} + B\hat{S} = \begin{pmatrix} M+B & A \\ A & M-B \end{pmatrix} \tag{7.154}$$

and investigate the consequences of this assumption. We will assume that the
matrix has been written down in the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis.

Step 1

Investigate the commutators:

$$[\hat{H},\hat{H}] = 0 \;,\quad [\hat{H},\hat{S}] \neq 0 \;,\quad [\hat{H},\hat{C}] \neq 0 \;,\quad [\hat{S},\hat{C}] \neq 0 \tag{7.155}$$

Since the Hamiltonian always commutes with itself and it is not explicitly dependent
on time, the physical observable connected to the Hamiltonian, namely
the energy, is conserved.

Since they do not commute with the assumed form of $\hat{H}$, neither $\hat{S}$ nor $\hat{C}$ is
conserved in this model.

When a physical observable is conserved, we say its value corresponds to a good
quantum number that can be used to characterize (label) the ket vector representing
the physical system.

Step 2

Investigate special cases (limits of the most general solution):

Case of A=0:

$$\hat{H} = M\hat{I} + B\hat{S} = \begin{pmatrix} M+B & 0 \\ 0 & M-B \end{pmatrix} \tag{7.156}$$

Now

$$[\hat{H},\hat{S}] = 0 \tag{7.157}$$

which means that $\hat{H}$ and $\hat{S}$ share a common set of eigenvectors. We already
know the eigenvectors (non-degenerate) for $\hat{S}$ and so the eigenvector/eigenvalue
problem for $\hat{H}$ is already solved (clearly this is a very powerful rule). We have

$$\hat{H}|K^0\rangle = (M+B)|K^0\rangle \quad\text{and}\quad \hat{H}|\bar{K}^0\rangle = (M-B)|\bar{K}^0\rangle \tag{7.158}$$

We could have surmised this from the diagonal form of the matrix representation,
since the only way the matrix could be diagonal is for the basis states of
the representation to be the eigenvectors.

$\hat{C}$ is not conserved in this case, since $[\hat{H},\hat{C}] \neq 0$.

The energy eigenstates, $\{|K^0\rangle, |\bar{K}^0\rangle\}$ in this case, are a basis for the vector
space. This is always true for a Hermitian operator. This means that we can
write any arbitrary vector as a linear combination of these vectors

$$|\psi\rangle = a|K^0\rangle + b|\bar{K}^0\rangle = a|E = M+B\rangle + b|E = M-B\rangle \tag{7.159}$$
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Now, as we derived earlier, energy eigenstates have a simple time dependence. 
We have 

H\E) = E\E) (7.160) 

U(t) | E) = e ~ iBt,h | E) = e ~ iEt,h \E) 

Therefore, in this case, the time dependence of the arbitrary state vector is given 

by 

| iP(t)) = ae~ i{M+B)t/h | K°) + b e - liM - B)t/h \K°) (7.161) 

This will be a general approach we will use, i.e., expand an arbitrary state in
energy eigenstates and use the simple time dependence of the energy
eigenstates to determine the more complex time dependence of the arbitrary state.
Of course, we have to be able to solve the eigenvector/eigenvalue problem for
the Hamiltonian (the energy operator) of the system under investigation.
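
The approach just described can be sketched numerically. The following is only a minimal illustration under stated assumptions: units with $\hbar = 1$, made-up values for M and B, and a hypothetical helper named `evolve`. It expands the initial state in the eigenvectors returned by NumPy's `numpy.linalg.eigh`, attaches the phase $e^{-iEt/\hbar}$ to each component, and compares the result with the closed form for the A = 0 case.

```python
import numpy as np

HBAR = 1.0  # assumption: natural units with hbar = 1


def evolve(H, psi0, t, hbar=HBAR):
    """Evolve psi0 under a time-independent Hermitian H via its energy eigenstates."""
    energies, vecs = np.linalg.eigh(H)          # columns of vecs are the |E_n>
    coeffs = vecs.conj().T @ psi0               # c_n = <E_n|psi(0)>
    phases = np.exp(-1j * energies * t / hbar)  # each eigenstate gets e^{-i E_n t / hbar}
    return vecs @ (phases * coeffs)             # sum_n c_n e^{-i E_n t / hbar} |E_n>


# Case A = 0: H = M I + B S is diagonal in the {|K0>, |K0bar>} basis.
M, B = 5.0, 0.3            # illustrative values only (assumed)
H = np.array([[M + B, 0.0], [0.0, M - B]])
a, b = 0.6, 0.8            # |a|^2 + |b|^2 = 1
psi0 = np.array([a, b], dtype=complex)

t = 2.0
psi_t = evolve(H, psi0, t)

# Closed form for this case: each basis component just picks up its own phase
expected = np.array([a * np.exp(-1j * (M + B) * t / HBAR),
                     b * np.exp(-1j * (M - B) * t / HBAR)])
```

The same `evolve` function works for any Hermitian matrix, so it can be reused for the B = 0 case and for the general Hamiltonian simply by changing `H`.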

Case of B = 0:

$$H = MI + AC = \begin{pmatrix} M & A \\ A & M \end{pmatrix} \tag{7.162}$$

Now

$$[H,C] = 0 \tag{7.163}$$


which means that H and C share a common set of eigenvectors. We already
know the eigenvectors (non-degenerate) for C and so the eigenvector/eigenvalue
problem for H is again already solved. We have

$$H|K_S\rangle = (M+A)|K_S\rangle \quad \text{and} \quad H|K_L\rangle = (M-A)|K_L\rangle \tag{7.164}$$

S is not conserved in this case, since $[H,S] \neq 0$.

The energy eigenstates, $\{|K_S\rangle, |K_L\rangle\}$ in this case, are a basis for the vector
space. In this basis

$$H = \begin{pmatrix} M+A & 0 \\ 0 & M-A \end{pmatrix} \tag{7.165}$$

as expected.


We could also solve this problem by finding the characteristic equation for the 
Hamiltonian matrix, i.e., since we have 

$$H|\psi\rangle = \lambda|\psi\rangle \ \rightarrow\ (H - \lambda I)|\psi\rangle = 0 \tag{7.166}$$

the characteristic equation is

$$\det(H - \lambda I) = \begin{vmatrix} M-\lambda & A \\ A & M-\lambda \end{vmatrix} = (M-\lambda)^2 - A^2 = 0 \ \rightarrow\ \lambda = M \pm A \tag{7.167}$$

Since we have another basis, we can write any arbitrary vector as a linear
combination of these vectors

$$|\psi\rangle = a|K_S\rangle + b|K_L\rangle = a|E = M+A\rangle + b|E = M-A\rangle \tag{7.168}$$

Therefore, in this case, we have the time dependence

$$|\psi(t)\rangle = a\,e^{-i(M+A)t/\hbar}|K_S\rangle + b\,e^{-i(M-A)t/\hbar}|K_L\rangle \tag{7.169}$$


Step 3 


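Solve the general Hamiltonian problem (if possible; otherwise we must use
approximation methods, which we will discuss later). We have

$$H = MI + AC + BS = \begin{pmatrix} M+B & A \\ A & M-B \end{pmatrix} \tag{7.170}$$

We assume that the eigenvectors satisfy $H|\phi\rangle = E|\phi\rangle$ where

$$|\phi\rangle = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \tag{7.171}$$

This gives

$$\begin{pmatrix} M+B & A \\ A & M-B \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = E \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \tag{7.172}$$

or

$$\begin{pmatrix} M+B-E & A \\ A & M-B-E \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = 0 \tag{7.173}$$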


This is a set of two homogeneous equations in two unknowns. It has a nontrivial 
solution only if the determinant of the coefficients is zero 


$$\begin{vmatrix} M+B-E & A \\ A & M-B-E \end{vmatrix} = (M+B-E)(M-B-E) - A^2 = 0 \tag{7.174}$$

This has solution

$$E_\pm = M \pm \sqrt{A^2+B^2} \quad \text{(the energy eigenvalues)} \tag{7.175}$$


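We solve for the eigenstates by substituting the eigenvalues into the
eigenvalue/eigenvector equation

$$\begin{pmatrix} M+B & A \\ A & M-B \end{pmatrix} \begin{pmatrix} \phi_{1\pm} \\ \phi_{2\pm} \end{pmatrix} = E_\pm \begin{pmatrix} \phi_{1\pm} \\ \phi_{2\pm} \end{pmatrix} \tag{7.176}$$

After some algebra we get

$$\frac{\phi_{1\pm}}{\phi_{2\pm}} = \frac{-A}{B \mp \sqrt{A^2+B^2}} = \frac{B \pm \sqrt{A^2+B^2}}{A} \tag{7.177}$$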
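
As a quick sanity check of the algebra, the eigenvalues and the component ratio can be compared with a direct numerical diagonalization. This is only a sketch: the values of M, A and B are arbitrary illustrative numbers (not physical K-meson parameters), and `numpy.linalg.eigh` is used because H is real symmetric.

```python
import numpy as np

M, A, B = 5.0, 1.2, 0.7    # illustrative values (assumed)
H = np.array([[M + B, A], [A, M - B]])

# eigh returns eigenvalues in ascending order: E_- first, then E_+
energies, vecs = np.linalg.eigh(H)

root = np.sqrt(A**2 + B**2)
E_minus, E_plus = M - root, M + root          # closed form (7.175)

# Component ratios phi_1 / phi_2 of the numerical eigenvectors
ratio_minus = vecs[0, 0] / vecs[1, 0]
ratio_plus = vecs[0, 1] / vecs[1, 1]

# Closed form (7.177), in both of its equivalent versions
closed_plus = (B + root) / A
closed_minus = (B - root) / A
alt_plus = -A / (B - root)
alt_minus = -A / (B + root)
```

The ratio is independent of how the eigenvectors are normalized or which overall sign the library chooses, which is why it is the convenient quantity to compare.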
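We check the validity of this solution by comparing it to the limiting cases;
that is why we looked at the special cases earlier.

For B = 0, we have

$$\frac{\phi_{1\pm}}{\phi_{2\pm}} = \frac{-A}{\mp A} = \pm 1 \ \rightarrow\ \phi_{1+} = \phi_{2+} \ \text{ and } \ \phi_{1-} = -\phi_{2-} \tag{7.178}$$

which says that

$$|\phi_+\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = |K_S\rangle \quad \text{and} \quad |\phi_-\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = |K_L\rangle \tag{7.179}$$

which agrees with the earlier results for this case.

In the other limiting case, A = 0, we have

$$\frac{\phi_{1+}}{\phi_{2+}} = \infty \ \rightarrow\ \phi_{1+} = 1 \ \text{ and } \ \phi_{2+} = 0 \tag{7.180}$$

$$\frac{\phi_{1-}}{\phi_{2-}} = 0 \ \rightarrow\ \phi_{1-} = 0 \ \text{ and } \ \phi_{2-} = 1 \tag{7.181}$$

which says that

$$|\phi_+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = |K^0\rangle \quad \text{and} \quad |\phi_-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |\bar{K}^0\rangle \tag{7.182}$$

which again agrees with the earlier results for this case.

If we normalize the general solution

$$\frac{\phi_{1\pm}}{\phi_{2\pm}} = \frac{-A}{B \mp \sqrt{A^2+B^2}} \tag{7.183}$$

using

$$|\phi_{1\pm}|^2 + |\phi_{2\pm}|^2 = 1 \tag{7.184}$$

we obtain (up to an overall phase)

$$\phi_{1\pm} = \frac{A}{\sqrt{A^2 + \left(B \mp \sqrt{A^2+B^2}\right)^2}} \ , \quad \phi_{2\pm} = \frac{-B \pm \sqrt{A^2+B^2}}{\sqrt{A^2 + \left(B \mp \sqrt{A^2+B^2}\right)^2}}$$

and

$$|\phi_\pm\rangle = \frac{1}{\sqrt{A^2 + \left(B \mp \sqrt{A^2+B^2}\right)^2}} \begin{pmatrix} A \\ -B \pm \sqrt{A^2+B^2} \end{pmatrix} \tag{7.185}$$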


Step 4 


Look at a realistic physical system that we can relate to experiment. 

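In the real world of K-mesons, the Hamiltonian is such that $B \ll A$. In this case
the states $\{|K_S\rangle, |K_L\rangle\}$ are almost energy eigenstates, or charge conjugation is
almost conserved. We expect that instead of being able to write

$$|\phi_+\rangle = |K_S\rangle \quad \text{and} \quad |\phi_-\rangle = |K_L\rangle \tag{7.186}$$

which would be true if B = 0, we should be able to write

$$\begin{aligned}
|\phi_+\rangle &= \cos\frac{\theta}{2}\,|K_S\rangle + \sin\frac{\theta}{2}\,|K_L\rangle \\
|\phi_-\rangle &= -\sin\frac{\theta}{2}\,|K_S\rangle + \cos\frac{\theta}{2}\,|K_L\rangle
\end{aligned} \tag{7.187}$$

where for $\theta \ll 1$ we clearly approximate the B = 0 result. Let us see how this
works. For $B \ll A$, we choose

$$\frac{\theta}{2} = \frac{B}{2A} \ll 1 \tag{7.188}$$

and get

$$\tan\frac{\theta}{2} = \frac{B}{2A} = \frac{\sin\frac{\theta}{2}}{\cos\frac{\theta}{2}} \tag{7.189}$$

To lowest order we can then say

$$\sin\frac{\theta}{2} = \frac{B}{2A} = \delta \ll 1 \quad \text{and} \quad \cos\frac{\theta}{2} = 1 \tag{7.190}$$

to get

$$\begin{aligned}
|\phi_+\rangle &= |K_S\rangle + \delta\,|K_L\rangle \\
|\phi_-\rangle &= |K_L\rangle - \delta\,|K_S\rangle
\end{aligned} \tag{7.191}$$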
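
The small-mixing approximation can be checked against the exact eigenvectors. The sketch below is illustrative only: the values of M, A and B are arbitrary (chosen so that $B \ll A$), and the column vectors used for $|K_S\rangle$ and $|K_L\rangle$ are the $(1, \pm 1)/\sqrt{2}$ combinations found in the B = 0 case. It compares the exact amplitude ratio $\langle K_S|\phi_-\rangle / \langle K_L|\phi_-\rangle$ with the lowest-order value $-\delta = -B/2A$.

```python
import numpy as np

M, A, B = 0.0, 1.0, 1.0e-3     # illustrative values with B << A (assumed)
H = np.array([[M + B, A], [A, M - B]])

# Eigenvectors of C written in the {|K0>, |K0bar>} basis
K_S = np.array([1.0, 1.0]) / np.sqrt(2)
K_L = np.array([1.0, -1.0]) / np.sqrt(2)

energies, vecs = np.linalg.eigh(H)
phi_minus = vecs[:, 0]          # eigenvector belonging to E_- = M - sqrt(A^2 + B^2)

delta = B / (2 * A)

# Ratio of amplitudes; independent of the overall sign chosen for phi_minus
amp_ratio = (K_S @ phi_minus) / (K_L @ phi_minus)
prob_ratio = amp_ratio**2       # ratio of K_S to K_L probabilities
```

For these values `amp_ratio` agrees with `-delta` to better than one part in a thousand, and `prob_ratio` with `delta**2`, which is the statement that the $K_S$ admixture in $|\phi_-\rangle$ is of order $\delta$.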
This says that if $|\psi_{in}\rangle = |\phi_-\rangle$ then the number

$$\frac{|\langle K_S|\phi_-\rangle|^2}{|\langle K_L|\phi_-\rangle|^2} = \frac{\text{probability of observing a } K_S}{\text{probability of observing a } K_L} = \frac{(\text{Number of } K_S)/(\text{Number of } K_S \text{ and } K_L)}{(\text{Number of } K_L)/(\text{Number of } K_S \text{ and } K_L)} = \frac{\text{Number of } K_S}{\text{Number of } K_L} \tag{7.192}$$

gives the experimental ratio of the number of times we will measure a final state
of $|K_S\rangle$ to the number of times we will measure the final state of $|K_L\rangle$. The
signature for seeing a final state of $|K_L\rangle$ is to see a decay to 3 $\pi$-mesons and
that of a final state of $|K_S\rangle$ is to see a decay to 2 $\pi$-mesons. The number is

$$\frac{|\langle K_S|\phi_-\rangle|^2}{|\langle K_L|\phi_-\rangle|^2} = \frac{|\delta|^2}{1} = |\delta|^2 \tag{7.193}$$


Now experiment gives the result $|\delta| = 2 \times 10^{-3}$. This number is a measure of how
large an effect strangeness non-conservation has on this system.


If B = 0, then charge conjugation is conserved. If $B \neq 0$, then charge conjugation
is not absolutely conserved. So $\delta$ is a measure of the lack of charge conjugation
conservation in the K-meson system. If we identify the energy eigenvalues as
the particle rest energies



$$M + A = m_S c^2 \ , \quad M - A = m_L c^2 \tag{7.194}$$

we then have

$$A = \frac{m_S c^2 - m_L c^2}{2} \approx 10^{-5} \ \text{eV} \tag{7.195}$$

and

$$2A\delta = 10^{-17} \, m_K c^2 \tag{7.196}$$

or

$$\frac{B}{m_K c^2} = 10^{-17} \tag{7.197}$$


Thus, this is a one part in $10^{17}$ effect! It is one of the best measurements
ever made. The discovery, made in 1964, won the Nobel Prize for the
experimenters (Cronin and Fitch) in 1980. It is now understood in detail by the
standard model of elementary particles.


Now let us look at Quantum Interference Effects in this K-meson system. 

Suppose that B = 0. Then the energy eigenstates are $\{|K_S\rangle, |K_L\rangle\}$ with
eigenvalues $M \pm A$. Now let

$$|\psi_{in}\rangle = |K^0\rangle = \frac{1}{\sqrt{2}}\left(|K_S\rangle + |K_L\rangle\right) \tag{7.198}$$

This is not an energy eigenstate so it will evolve in time. Its time evolution is
given by

$$\begin{aligned}
|\psi(t)\rangle &= e^{-iHt/\hbar}|\psi_{in}\rangle = e^{-iHt/\hbar}|K^0\rangle \\
&= \frac{1}{\sqrt{2}}\left(e^{-iHt/\hbar}|K_S\rangle + e^{-iHt/\hbar}|K_L\rangle\right) \\
&= \frac{1}{\sqrt{2}}\left(e^{-i(M+A)t/\hbar}|K_S\rangle + e^{-i(M-A)t/\hbar}|K_L\rangle\right)
\end{aligned} \tag{7.199}$$

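The probability amplitude that the initial meson changes (oscillates) into the
orthogonal state $|\bar{K}^0\rangle$ at time t is given by

$$\begin{aligned}
\langle\bar{K}^0|\psi(t)\rangle &= \frac{1}{\sqrt{2}}\left(e^{-i(M+A)t/\hbar}\langle\bar{K}^0|K_S\rangle + e^{-i(M-A)t/\hbar}\langle\bar{K}^0|K_L\rangle\right) \\
&= \frac{1}{2}\left(e^{-i(M+A)t/\hbar} - e^{-i(M-A)t/\hbar}\right)
\end{aligned} \tag{7.200}$$

Finally, the probability that the incoming $K^0$-meson will behave like a $\bar{K}^0$-meson
at time t (that it has oscillated into a $\bar{K}^0$) is given by

$$P_{\bar{K}^0}(t) = |\langle\bar{K}^0|\psi(t)\rangle|^2 = \frac{1}{2}\left[1 - \cos\Omega t\right] \tag{7.201}$$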


where

$$\Omega = \frac{1}{\hbar}\left((M+A) - (M-A)\right) = \frac{2A}{\hbar} = \frac{(m_S - m_L)c^2}{\hbar} \tag{7.202}$$

What is the physics here? If $m_S - m_L = 0$, then $\Omega = 0$ and $P_{\bar{K}^0}(t) = 0$, or the two
mesons do not change into one another (called oscillation) as time passes.


However, if $m_S - m_L \neq 0$, then $P_{\bar{K}^0}(t) \neq 0$ and the two mesons oscillate back
and forth, sometimes being a $K^0$ and sometimes being a $\bar{K}^0$. This has been
observed in the laboratory and is, in fact, the way that the extremely small
mass difference is actually measured.
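
The oscillation formula can be reproduced by evolving $|K^0\rangle$ numerically and projecting onto $|\bar{K}^0\rangle$. The following sketch assumes units with $\hbar = 1$ and arbitrary illustrative values of M and A; the function name `prob_K0bar` is ours and is introduced only for this example.

```python
import numpy as np

HBAR = 1.0                      # assumption: units with hbar = 1
M, A = 5.0, 0.25                # illustrative values (assumed)

K0 = np.array([1.0, 0.0], dtype=complex)
K0bar = np.array([0.0, 1.0], dtype=complex)
H = np.array([[M, A], [A, M]], dtype=complex)   # B = 0 case in the {|K0>, |K0bar>} basis


def prob_K0bar(t):
    """Probability that an initial |K0> is found as |K0bar> at time t."""
    energies, vecs = np.linalg.eigh(H)
    # U(t) = sum_n e^{-i E_n t / hbar} |E_n><E_n|
    U = vecs @ np.diag(np.exp(-1j * energies * t / HBAR)) @ vecs.conj().T
    return abs(np.vdot(K0bar, U @ K0)) ** 2


omega = 2 * A / HBAR            # Omega = ((M + A) - (M - A)) / hbar
times = np.linspace(0.0, 4 * np.pi / omega, 9)
numeric = np.array([prob_K0bar(t) for t in times])
formula = 0.5 * (1 - np.cos(omega * times))
```

Note that M drops out of the probability entirely: only the splitting 2A between the two energy eigenvalues sets the oscillation frequency, which is why the oscillation measures the mass difference.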


This is also the same mechanism that is proposed for oscillations between the
different flavors of neutrinos. Experiment seems to indicate that neutrino
oscillation is taking place and one of the possible explanations is that all the
neutrino masses cannot be zero!


7.4. Stern-Gerlach Experiments and Measurement 

(This section follows the work of Feynman and Townsend.) 

From early in our exploration of quantum mechanics, measurement played a
central role. A basic axiom of quantum theory is the von Neumann projection
postulate:

If an observable A is measured, the result is one of
its eigenvalues, a. After the measurement, the system
is projected into the eigenvector $|a\rangle$. If $|\psi\rangle$ is the
state before the measurement, the probability of this
occurrence is $|\langle a|\psi\rangle|^2$.

Let us now explore the physical consequences of these measurements. The most 
fundamental example is the measurement of the angular momentum component 
of a spin-1/2 particle which takes on only two possible values. We will discuss 
the full theory of spin in Chapter 9. 

The measurement was first carried out by Stern and Gerlach in 1922 to test
Bohr’s ideas about "space quantization". This was before Uhlenbeck and
Goudsmit’s invention of spin angular momentum, so Stern and Gerlach’s results
were not completely understood. Nonetheless, the results were startling and one
of the first real pictures of the strange quantum world.

Consider the force on a magnetic moment which moves through a spatially 
inhomogeneous magnetic field. The potential energy is 

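$$V = -\vec{\mu} \cdot \vec{B}(\vec{r}) \tag{7.203}$$

and thus the force is

$$\vec{F} = -\nabla V = \nabla\left(\vec{\mu} \cdot \vec{B}(\vec{r})\right) = (\vec{\mu} \cdot \nabla)\vec{B}(\vec{r}) \tag{7.204}$$

Stern and Gerlach set up a spatially inhomogeneous field with one very large
component (call it the z-component).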

A schematic diagram of a Stern-Gerlach apparatus is shown in Figure 7.4 below. 



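Figure 7.4: Stern-Gerlach Apparatus

We then have

$$\begin{aligned}
&\hat{n} = \hat{z} \ , \quad \vec{B} \approx B(z)\hat{z} + \text{small } x, y \text{ components} \\
&\Rightarrow \ (\vec{\mu} \cdot \nabla)\vec{B}(\vec{r}) \approx \mu_z \frac{\partial B}{\partial z}\hat{z}
\end{aligned} \tag{7.205}$$

Now quantum mechanically, $\vec{\mu} = \gamma\vec{J}$ which gives

$$\vec{F} = \gamma \frac{\partial B}{\partial z} J_z \hat{z} \tag{7.206}$$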

Stern and Gerlach sent Ag (silver) atoms from an oven, collimated into a
beam, into the region of inhomogeneous field. Ag has a single valence
electron outside closed shells, with orbital angular momentum $\ell = 0$ (an
s-state). Thus, all of the angular momentum of the atom is due to the spin
angular momentum of the single valence electron (spin 1/2). Thus

$$\vec{\mu} = -\frac{2\mu_B}{\hbar}\vec{S} = -\mu_B\vec{\sigma} \ \Rightarrow \ \vec{F} = -\mu_B \frac{\partial B}{\partial z}\sigma_z \hat{z} \tag{7.207}$$

Clearly, the spin-up atoms $|\uparrow_z\rangle$ will experience a different interaction than
spin-down atoms $|\downarrow_z\rangle$ (due to two different eigenvalues of $\sigma_z$), thereby splitting the
beam into two spatially separated beams as shown in Figure 7.4 above. We note
that the spins which emerge from the oven have a random orientation.

In general, a Stern-Gerlach apparatus takes an incoming beam of particles with 
angular momentum and splits the beam into a number of spots on a screen. 
The number of spots is equal to 2J + 1, where J = angular momentum value of 
the incoming beam. 

We shall see in later chapters that 2J + 1 = the number of values allowed quantum
mechanically for the measurable quantity $\vec{J} \cdot \hat{n}$, where $\hat{n}$ is the unit vector in the
direction of the magnetic field inside the apparatus.

In the diagram, spot #2 (undeflected beam) is where the beam would have hit
the screen if no magnetic field were present in the apparatus. Spots #1 and #3
are an example of the 2J + 1 = 2 spots we would observe in an experiment when
J = 1/2 (the original Stern-Gerlach experiment).

The important features of the apparatus for our theoretical discussion are: 

1. The breakup into a finite number of discrete beams (we will assume we
are working with J = 1/2 particles and thus have 2 beams exiting the
apparatus)

2. The beams are separated in 3-dimensional space and each contains 1/2 of
the original particles entering the device

3. The possible values of $\vec{J} \cdot \hat{n}$ for J = 1/2 are $\pm\hbar/2$ for any $\hat{n}$.

4. One exiting beam contains only particles with $\vec{J} \cdot \hat{n} = +\hbar/2$ and the other
beam contains only particles with $\vec{J} \cdot \hat{n} = -\hbar/2$

We will represent the beam state vectors by the ket vectors

$$\begin{aligned}
|+\hat{n}\rangle &= \text{beam with } \vec{J} \cdot \hat{n} = +\hbar/2 \\
|-\hat{n}\rangle &= \text{beam with } \vec{J} \cdot \hat{n} = -\hbar/2
\end{aligned} \tag{7.208}$$

and the Stern-Gerlach apparatus with a field in the $\hat{n}$-direction by SGn.

5. The above results occur for all beams no matter what the direction of the
unit vector $\hat{n}$

The S-G experiment has all the essential ingredients of a quantum measurement. 
The quantum degree of freedom, here spin-1/2, is correlated with the final beam 
direction of the atoms. Once the two spots on the screen are resolvable, we can 
measure the spin state of the atom by determining which spatially separated 
beam the atom is in. 

We now report the results of a series of actual experiments. 

Experiment 1

We send N particles into an SGz device and select out the beam where the
particles are in the state $|+\hat{z}\rangle$ (we block the other beam). It contains N/2
particles. We then send this second beam into another SGz device. We find
that all N/2 exit in the state $|+\hat{z}\rangle$. There is only one exit beam. Symbolically,
this looks like Figure 7.5 below:



Figure 7.5: Experiment #1


This says that when we make a measurement, say of $\vec{J} \cdot \hat{z}$, and then immediately
make another measurement of the same quantity, we get the same result as the first
measurement with probability = 1. Since the only way to obtain a measurement result
with certainty is to be in an eigenstate of the observable, the measurement seems
to have caused the system to change from a superposition of eigenstates into two
beams (separated in physical space) each with a definite state of the observable
$\vec{J} \cdot \hat{z}$ (one of its eigenstates).

This experiment is called state preparation. We prepare a state by measuring an
observable and finding one of the eigenvalues. Afterwards, by the von Neumann 
projection postulate, the state is the corresponding eigenvector. In the above 
experiment, by redirecting the spin-up beam into the second SG apparatus we 
are guaranteed to find the eigenstate prepared by the first SG apparatus (with 
the block). Thus a screen would have only one spot after the second apparatus. 

Experiment 2 

We send N particles into an SGz device and select out the beam where the
particles are in the state $|+\hat{z}\rangle$. It contains N/2 particles. We then send the
selected beam into an SGx device. We find that N/4 exit in the state $|+\hat{x}\rangle$ and
N/4 exit in the state $|-\hat{x}\rangle$. At the end, there are two exit beams. Symbolically,
this looks like Figure 7.6 below:



Figure 7.6: Experiment #2


The same thing happens if we stop the $|+\hat{z}\rangle$ beam and let the $|-\hat{z}\rangle$ beam into the
SGx device. So an SGx device takes a beam with a definite value of $\vec{J} \cdot \hat{z}$ and
randomizes it, i.e., we once again have two exiting beams with equal numbers
of particles. This is saying that for a system in an eigenstate of one observable,
the measurement of an incompatible observable (the observables do not commute)
randomizes the value of the original observable. In this case $[J_z, J_x] \neq 0$.

As we will show in Chapter 9, the $|+\hat{z}\rangle$ state is not an eigenstate of $J_x$. In fact,

$$|+\hat{z}\rangle = \frac{1}{\sqrt{2}}\left(|+\hat{x}\rangle + |-\hat{x}\rangle\right) \tag{7.209}$$

i.e., a 50-50 superposition of spin-up and spin-down in the x-direction. Thus, we
see two beams after the SGx apparatus. So even though a pure state is entering
the SGx apparatus, we do not get a definite value because $[J_z, J_x] \neq 0$.

Experiment 3 

We now add a third SG device to Experiment 2. It is an SGz device. We also
block the $|-\hat{x}\rangle$ exiting beam as shown symbolically in Figure 7.7 below.

Figure 7.7: Experiment #3

We found above that N/4 exited in the state $|+\hat{x}\rangle$ from the SGx device. After
the third device we find that N/8 exit in the state $|+\hat{z}\rangle$ and N/8 exit in the state
$|-\hat{z}\rangle$.


What has happened? It seems that making a measurement of $\vec{J} \cdot \hat{x}$ on a beam
with definite $\vec{J} \cdot \hat{z}$ modifies the system rather dramatically.

We did two successive measurements on these particles. Since we isolated the
+ beam in each case we might be led to think that the beam entering the last
SGz device (because of our selections) has

$$\vec{J} \cdot \hat{z} = +\frac{\hbar}{2} \quad \text{AND} \quad \vec{J} \cdot \hat{x} = +\frac{\hbar}{2} \tag{7.210}$$

But the experiment says this cannot be so, since 50% of the particles exiting
the last device have

$$\vec{J} \cdot \hat{z} = -\frac{\hbar}{2} \tag{7.211}$$

We are forced to say that the SGz device takes a definite value of $\vec{J} \cdot \hat{x}$ and
randomizes it so that we end up with two exiting beams with equal numbers of
particles.

Why? Again, since $[J_z, J_x] \neq 0$, the two observables cannot share a common
set of eigenvectors, i.e., they are incompatible. Since an observable can only have
a definite value when the state is one of its eigenvectors, it is not possible for
both $\vec{J} \cdot \hat{z}$ and $\vec{J} \cdot \hat{x}$ to simultaneously have definite values.

Our two successive measurements DO NOT produce definite values for both
observables. Each measurement only produces a definite value for the observable
it is measuring and randomizes the incompatible observable (actually it
randomizes all other incompatible observables)!

All measurement results depend on the context of the measurement.

Another way to think about this is to say the following. An SGn device is a
measurement of the angular momentum in the $\hat{n}$-direction. Any such measurement
of an observable randomizes the next measurement of any other incompatible
observable.


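In this cascaded measurement, the probability of finding $|\downarrow_z\rangle$ is the product of
conditional probabilities for uncorrelated events

$$P(\downarrow_z | \uparrow_x, \uparrow_z) = P(\downarrow_z | \uparrow_x)\,P(\uparrow_x | \uparrow_z) = |\langle\downarrow_z|\uparrow_x\rangle|^2 \, |\langle\uparrow_x|\uparrow_z\rangle|^2 = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$$

as is observed.

Now without the intermediate apparatus, classical theory would argue as follows.
If we don’t measure $J_x$ we must sum the probabilities of all possible alternatives
or

$$P(\downarrow_z | \uparrow_z) = P(\downarrow_z | \uparrow_x)\,P(\uparrow_x | \uparrow_z) + P(\downarrow_z | \downarrow_x)\,P(\downarrow_x | \uparrow_z) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$$

which differs from the experimental result (= 0).

As we have discussed, quantum mechanically, if the different possibilities are
indistinguishable (i.e., there is no information available to distinguish the
different alternatives) we must add the probability amplitudes (before squaring)
so that these different alternatives can interfere.

$$\begin{aligned}
P(\downarrow_z | \uparrow_z) &= |\langle\downarrow_z|\uparrow_z\rangle|^2 = |\langle\downarrow_z|\hat{I}|\uparrow_z\rangle|^2 \\
&= \left|\langle\downarrow_z|\left(|\uparrow_x\rangle\langle\uparrow_x| + |\downarrow_x\rangle\langle\downarrow_x|\right)|\uparrow_z\rangle\right|^2 \\
&= P(\downarrow_z | \uparrow_x)\,P(\uparrow_x | \uparrow_z) + P(\downarrow_z | \downarrow_x)\,P(\downarrow_x | \uparrow_z) + \text{interference terms} \\
&= \frac{1}{4} + \frac{1}{4} - \frac{1}{2} = 0
\end{aligned}$$

So, the quantum amplitude interference calculation gives the correct result.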
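
The three calculations above can be put side by side in a few lines. This is a sketch only: it assumes the conventional real column vectors for the spin states in the $J_z$ basis, with $|\downarrow_x\rangle = (|\uparrow_z\rangle - |\downarrow_z\rangle)/\sqrt{2}$, and the helper name `amp` is ours.

```python
import numpy as np

up_z = np.array([1.0, 0.0])
down_z = np.array([0.0, 1.0])
up_x = (up_z + down_z) / np.sqrt(2)      # conventional phase choice (assumed)
down_x = (up_z - down_z) / np.sqrt(2)


def amp(final, initial):
    """Probability amplitude <final|initial>."""
    return np.vdot(final, initial)


# A real SGx measurement that passes only the up_x beam: multiply probabilities
p_cascade = abs(amp(down_z, up_x))**2 * abs(amp(up_x, up_z))**2

# "Classical" argument: sum the probabilities of the two alternatives
p_classical = sum(abs(amp(down_z, x))**2 * abs(amp(x, up_z))**2
                  for x in (up_x, down_x))

# Quantum rule for indistinguishable alternatives: add amplitudes, then square
p_quantum = abs(sum(amp(down_z, x) * amp(x, up_z) for x in (up_x, down_x)))**2
```

Here `p_cascade` is 1/4, `p_classical` is 1/2 and `p_quantum` vanishes; the only difference between the last two lines is whether the absolute square is taken before or after the sum over the intermediate states.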

What constitutes a measurement? 

If we send atoms with spin-up in the z-direction into an x-oriented SG apparatus,
there is a 50 - 50 probability of emerging as spin-up or spin-down in the
x-direction.

But if we do not detect which port the atom exits from, did we measure the 
spin in the x-direction? 

In a sense, no. A measurement is something that removes coherence, i.e., the 
ability for quantum processes to interfere. 

In principle, we can recombine these beams as if no measurement had occurred,
i.e., we can still have interference - we do not remove coherence unless we look.
Again the context of the experiment is the significant feature!

Experiment 4 

Now, let us construct a new device to illustrate this point. It is called a modified 
SGx device. It looks like Figure 7.8 below. 


Figure 7.8: Experiment #4


Any beam of particles entering this modified device would experience deflections
while traveling through it. However, the device lengths and field strengths have
been cleverly chosen so that the net effect for any beam is no change! The
internal beams are all recombined so that the state of the beam upon exiting
the entire device is identical to the state of the beam before entering the device.
This device might be called a total-of-nothing device.

It turns out, however, that we can use this device to make a measurement
and select a particular spin state. We can calculate the paths that would be
followed in the device by the $|+\hat{x}\rangle$ and $|-\hat{x}\rangle$ beams (these are the relevant beams to
consider because these are SGx devices). Using this information, we can block
the path that a particle in the state $|-\hat{x}\rangle$ would follow. Then, all the particles
exiting the modified SGx device would be in the state $|+\hat{x}\rangle$.

Experiment 5 

We can confirm this fact by inserting a modified SGx device into Experiment 3
where we replace the SGx device by the modified SGx device. If we block the
$|-\hat{x}\rangle$ beam, the state at the end of the modified SGx device is the same as it
was after the original SGx device and we get exactly the same results. In fact,
we get the same result whether we block the $|-\hat{x}\rangle$ beam or the $|+\hat{x}\rangle$ beam, as
should be the case.

Now we set up Experiment 5. It is shown in Figure 7.9 below.

Figure 7.9: Experiment #5


In this experiment a beam enters the SGz device and we block the exiting $|-\hat{z}\rangle$
beam. The $|+\hat{z}\rangle$ beam is sent into the modified SGx device and the exit beam
is sent into the final SGz device. In this case, however, we DO NOT block any
of the paths in the modified SGx device. Since the beam entering the modified
SGx device is reconstructed before it exits (we already saw that it does this
in Experiment 4), we are NOT making a measurement of $\vec{J} \cdot \hat{x}$ using the
modified SGx device as we did when we used the original SGx device.

Now we send in N particles and N/2 are in the $|+\hat{z}\rangle$ beam as before. However,
now, instead of finding N/8 particles in the final $|+\hat{z}\rangle$ beam, we find N/2
particles. ALL the particles make it through unchanged, even though there are SGx
devices in between. It behaves as if the modified SGx device was not there at
all (hence its name).

We might have assumed that 50% of the particles in the $|+\hat{z}\rangle$ beam before the
modified SGx device would emerge in the $|+\hat{x}\rangle$ state and the other 50% would
emerge in the $|-\hat{x}\rangle$ state. Experiment 5 says this cannot be true, since if it were
true, then we would have two beams coming out of the final SGz device, each
with 50% of the particles. Our results are incompatible with the statement that
the particles passing through the modified SGx device are either in the state
$|+\hat{x}\rangle$ or in the state $|-\hat{x}\rangle$.

In fact, if we carry out this experiment with a very low intensity beam where
only one particle at a time is passing through the apparatus, then we observe
that each particle emerging from the final SGz device is in the state $|+\hat{z}\rangle$. This
eliminates any explanation of the result that would invoke interactions among
the particles while they were in the apparatus.

For beams of particles, we have been talking in terms of percentages or fractions 
of the particles as experimental results. For a single particle, however, it is not 
possible to predict with certainty the outcome of a measurement in advance. 
We have seen this earlier in our experiments with photons and polaroids. We 
are only able to use probability arguments in this case.
In Experiment 2, for instance, before a measurement (passing through the SGx
device) of $\vec{J} \cdot \hat{x}$ on a single particle in the $|+\hat{z}\rangle$ state, all we can say is that there
is a 50% probability of obtaining $|+\hat{x}\rangle$ and a 50% probability of obtaining $|-\hat{x}\rangle$.

Probabilities alone are not enough, however, to explain Experiment 5. We came
to the incorrect conclusion because we made the same mistake as in our earlier
discussion of polarization. We added the separate probabilities of the
indistinguishable ways to get the total probability, whereas the correct result, as we
know, is to add the amplitudes of the indistinguishable ways and then square
the total amplitude to get the correct result. We eliminated the interference
effects! When we don't actually make a measurement, as in the modified SGx
device, we must add amplitudes and not probabilities.

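We can now introduce a formalism, which should allow us to explain all
experiments correctly. We need a 2-dimensional vector space to describe these
physical systems. As a basis for this space we can use any of the sets
$\{|+\hat{n}\rangle, |-\hat{n}\rangle\}$ corresponding to the definite values $\pm\hbar/2$ for $\vec{J} \cdot \hat{n}$. Each of
these is an orthonormal basis where

$$\langle +\hat{n}|+\hat{n}\rangle = 1 = \langle -\hat{n}|-\hat{n}\rangle \quad \text{and} \quad \langle +\hat{n}|-\hat{n}\rangle = 0 \tag{7.212}$$

Any arbitrary state can be written as a superposition of the basis states

$$|\psi\rangle = \langle +\hat{n}|\psi\rangle\,|+\hat{n}\rangle + \langle -\hat{n}|\psi\rangle\,|-\hat{n}\rangle \tag{7.213}$$

and the operators can be written as

$$\vec{J} \cdot \hat{n} = \frac{\hbar}{2}|+\hat{n}\rangle\langle +\hat{n}| - \frac{\hbar}{2}|-\hat{n}\rangle\langle -\hat{n}| \tag{7.214}$$

Finally, expectation values are given by

$$\langle J_z\rangle = \langle\psi|J_z|\psi\rangle = \frac{\hbar}{2}|\langle +\hat{z}|\psi\rangle|^2 - \frac{\hbar}{2}|\langle -\hat{z}|\psi\rangle|^2 \tag{7.215}$$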
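
This formalism translates directly into small matrices. The sketch below is illustrative and rests on assumptions that are not part of the text: units with $\hbar = 1$, a hypothetical helper `spin_operator`, and an arbitrarily chosen normalized state. It builds $J_z$ from its two projectors and checks that the direct expectation value agrees with the weighted sum of probabilities.

```python
import numpy as np

HBAR = 1.0   # assumption: units with hbar = 1


def spin_operator(plus, minus, hbar=HBAR):
    """Build J.n = (hbar/2)|+n><+n| - (hbar/2)|-n><-n| from its two eigenvectors."""
    return (hbar / 2) * (np.outer(plus, plus.conj()) - np.outer(minus, minus.conj()))


plus_z = np.array([1.0, 0.0], dtype=complex)
minus_z = np.array([0.0, 1.0], dtype=complex)
J_z = spin_operator(plus_z, minus_z)

# An arbitrary normalized state a|+z> + b|-z> (values chosen for illustration)
a, b = 0.6, 0.8j
psi = a * plus_z + b * minus_z

# <psi|J_z|psi> computed directly ...
expect_direct = np.vdot(psi, J_z @ psi).real

# ... and as (hbar/2)|<+z|psi>|^2 - (hbar/2)|<-z|psi>|^2
expect_formula = (HBAR / 2) * (abs(np.vdot(plus_z, psi))**2
                               - abs(np.vdot(minus_z, psi))**2)
```

Passing a different orthonormal pair to `spin_operator` gives the operator for any other direction $\hat{n}$, expressed in whatever basis the two vectors are written in.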

Analysis of Experiment 3 

In Experiment 3, the state before entering the first SGz device is

$$|\psi_1\rangle = a|+\hat{z}\rangle + b|-\hat{z}\rangle \quad \text{where} \quad |a|^2 + |b|^2 = 1 \tag{7.216}$$

Since the SGz device is a measurement of $J_z$, after the SGz device the state is
$|+\hat{z}\rangle$ (remember that we blocked the $|-\hat{z}\rangle$ path).

It is very important to realize that we cannot answer a question about a
measurement unless we express the state in the basis consisting of the eigenvectors
of the operator representing the observable being measured. We call this the
home space of the observable. That is why we used the $J_z$ eigenvectors to
discuss the measurement made using the SGz device.

Now, we are going to make a measurement of $J_x$ in the SGx device. So we now
switch to a basis consisting of the $J_x$ eigenvectors. We know we can write

$$|+\hat{x}\rangle = \langle +\hat{z}|+\hat{x}\rangle\,|+\hat{z}\rangle + \langle -\hat{z}|+\hat{x}\rangle\,|-\hat{z}\rangle \tag{7.217}$$

One of our experiments tells us that when we send a particle in a $|+\hat{x}\rangle$ state
through an SGz device, the probability = 1/2 that we find $|+\hat{z}\rangle$ and 1/2 that we
find $|-\hat{z}\rangle$. This means that

$$|\langle +\hat{z}|+\hat{x}\rangle|^2 = |\langle -\hat{z}|+\hat{x}\rangle|^2 = \frac{1}{2} \tag{7.218}$$

or

$$\langle +\hat{z}|+\hat{x}\rangle = \frac{e^{i\alpha_+}}{\sqrt{2}} \quad \text{and} \quad \langle -\hat{z}|+\hat{x}\rangle = \frac{e^{i\alpha_-}}{\sqrt{2}} \tag{7.219}$$

so that

$$|+\hat{x}\rangle = \frac{e^{i\alpha_+}}{\sqrt{2}}|+\hat{z}\rangle + \frac{e^{i\alpha_-}}{\sqrt{2}}|-\hat{z}\rangle \tag{7.220}$$

Similarly,

$$|-\hat{x}\rangle = \frac{e^{i\beta_+}}{\sqrt{2}}|+\hat{z}\rangle + \frac{e^{i\beta_-}}{\sqrt{2}}|-\hat{z}\rangle \tag{7.221}$$


Experiment 6 


In this experiment we replace the last SGz device in Experiment 3 with an SGy
device. The last part of the setup is shown in Figure 7.10 below.

Figure 7.10: Experiment #6 (each of the two beams exiting the SGy device is labeled N/4)

The incoming beam is from the first SGz device, where we blocked the $|-\hat{z}\rangle$ path,
and therefore, we have N particles in the state $|+\hat{z}\rangle$. They now go through the
SGx device, where we block the $|-\hat{x}\rangle$ path, as shown. The beam entering the
SGy device has N/2 particles in the $|+\hat{x}\rangle$ state.

Whether we call the last direction the z-direction or the y-direction cannot affect
the results of the experiment, so we get the same result for this experiment as
we did for Experiment 3, except that we use the y-label instead of the z-label.
If the SGx device were replaced by an SGz device we would get the same result
also, since whether we call the direction the z-direction or the x-direction cannot
affect the results of the experiment.

Putting this all together we can say that 



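$$|+\hat{y}\rangle = \langle +\hat{z}|+\hat{y}\rangle\,|+\hat{z}\rangle + \langle -\hat{z}|+\hat{y}\rangle\,|-\hat{z}\rangle \tag{7.222}$$

with

$$|\langle +\hat{z}|+\hat{y}\rangle|^2 = |\langle -\hat{z}|+\hat{y}\rangle|^2 = \frac{1}{2} \tag{7.223}$$

and

$$|+\hat{y}\rangle = \langle +\hat{x}|+\hat{y}\rangle\,|+\hat{x}\rangle + \langle -\hat{x}|+\hat{y}\rangle\,|-\hat{x}\rangle \tag{7.224}$$

with

$$|\langle +\hat{x}|+\hat{y}\rangle|^2 = |\langle -\hat{x}|+\hat{y}\rangle|^2 = \frac{1}{2} \tag{7.225}$$

The conventional choice for the phase factors is such that we have

$$|+\hat{x}\rangle = \frac{1}{\sqrt{2}}|+\hat{z}\rangle + \frac{1}{\sqrt{2}}|-\hat{z}\rangle \tag{7.226}$$

$$|+\hat{y}\rangle = \frac{1}{\sqrt{2}}|+\hat{z}\rangle + \frac{i}{\sqrt{2}}|-\hat{z}\rangle \tag{7.227}$$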


We will derive all of these results more rigorously in Chapter 9. 
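
It is easy to check numerically that a set of phase choices satisfies all of the 50-50 conditions at once. The sketch below assumes the standard convention in the $J_z$ basis, namely $|\pm\hat{x}\rangle = (|+\hat{z}\rangle \pm |-\hat{z}\rangle)/\sqrt{2}$ and $|+\hat{y}\rangle = (|+\hat{z}\rangle + i|-\hat{z}\rangle)/\sqrt{2}$; the form used for $|-\hat{x}\rangle$ and the helper name `prob` are our assumptions for the purpose of the example.

```python
import numpy as np

plus_z = np.array([1.0, 0.0], dtype=complex)
minus_z = np.array([0.0, 1.0], dtype=complex)

# Conventional phase choices, written in the J_z basis (assumed convention)
plus_x = (plus_z + minus_z) / np.sqrt(2)
minus_x = (plus_z - minus_z) / np.sqrt(2)
plus_y = (plus_z + 1j * minus_z) / np.sqrt(2)


def prob(final, initial):
    """Probability |<final|initial>|^2."""
    return abs(np.vdot(final, initial)) ** 2


checks = {
    "z components of +y": (prob(plus_z, plus_y), prob(minus_z, plus_y)),
    "x components of +y": (prob(plus_x, plus_y), prob(minus_x, plus_y)),
    "z components of +x": (prob(plus_z, plus_x), prob(minus_z, plus_x)),
}

# A real relative phase for |+y> cannot work: this candidate coincides with |+x>,
# so its overlap probability with |+x> is 1 instead of 1/2.
bad_plus_y = (plus_z + minus_z) / np.sqrt(2)
bad = prob(plus_x, bad_plus_y)
```

Every pair in `checks` comes out as (1/2, 1/2), while `bad` equals 1. This is one way to see why a complex relative phase is unavoidable once three mutually incompatible directions must be described in a 2-dimensional space.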


7.5. Problems 

7.5.1. Change the Basis 

In examining light polarization in the text, we have been working in the
$\{|x\rangle, |y\rangle\}$ basis.

(a) Just to show how easy it is to work in other bases, express $\{|x\rangle, |y\rangle\}$ in
the $\{|R\rangle, |L\rangle\}$ and $\{|45^\circ\rangle, |135^\circ\rangle\}$ bases.

(b) If you are working in the $\{|R\rangle, |L\rangle\}$ basis, what would the operator
representing a vertical polaroid look like?

7.5.2. Polaroids 

Imagine a situation in which a photon in the $|x\rangle$ state strikes a vertically oriented
polaroid. Clearly the probability of the photon getting through the vertically
oriented polaroid is 0. Now consider the case of two polaroids with the photon
in the $|x\rangle$ state striking a polaroid oriented at 45° and then striking a vertically
oriented polaroid.

Show that the probability of the photon getting through both polaroids is 1/4.

Consider now the case of three polaroids with the photon in the $|x\rangle$ state striking
a polaroid oriented at 30° first, then a polaroid oriented at 60° and finally a
vertically oriented polaroid.

Show that the probability of the photon getting through all three polaroids is
27/64.

7.5.3. Calcite Crystal 

A photon polarized at an angle $\theta$ to the optic axis is sent through a slab of
calcite crystal. Assume that the slab is $10^{-2}$ cm thick, the direction of photon
propagation is the z-axis and the optic axis lies in the x - y plane.

Calculate, as a function of $\theta$, the transition probability for the photon to emerge
left circularly polarized. Sketch the result. Let the frequency of the light be
given by $c/\omega = 5000$ Å, and let $n_e = 1.50$ and $n_o = 1.65$ for the calcite indices of
refraction.


7.5.4. Turpentine 

Turpentine is an optically active substance. If we send plane polarized light into 
turpentine then it emerges with its plane of polarization rotated. Specifically, 
turpentine induces a left-hand rotation of about 5° per cm of turpentine that 
the light traverses. Write down the transition matrix that relates the incident 
polarization state to the emergent polarization state. Show that this matrix is 
unitary. Why is that important? Find its eigenvectors and eigenvalues, as a 
function of the length of turpentine traversed. 


7.5.5. What QM is all about - Two Views 

Photons polarized at 30° to the x-axis are sent through a y-polaroid. An
attempt is made to determine how frequently the photons that pass through the 
polaroid, pass through as right circularly polarized photons and how frequently 
they pass through as left circularly polarized photons. This attempt is made as 
follows: 

First, a prism that passes only right circularly polarized light is placed between 
the source of the 30° polarized photons and the y-polaroid, and it is determined 
how frequently the 30° polarized photons pass through the y-polaroid. Then 
this experiment is repeated with a prism that passes only left circularly polarized 
photons instead of the one that passes only right. 

(a) Show by explicit calculation using standard amplitude mechanics that the
sum of the probabilities for passing through the y-polaroid measured in
these two experiments is different from the probability that one would
measure if there were no prism in the path of the photon and only the
y-polaroid.

Relate this experiment to the two-slit diffraction experiment. 

(b) Repeat the calculation using density matrix methods instead of amplitude 
mechanics. 


7.5.6. Photons and Polarizers 

A photon polarization state for a photon propagating in the z-direction is given

by 



(a) What is the probability that a photon in this state will pass through a 
Polaroid with its transmission axis oriented in the y-direction? 

(b) What is the probability that a photon in this state will pass through a
Polaroid with its transmission axis y' making an angle $\varphi$ with the y-axis?

(c) A beam carrying N photons per second, each in the state $|\psi\rangle$, is totally
absorbed by a black disk with its surface normal in the z-direction. How
large is the torque exerted on the disk? In which direction does the disk
rotate? REMINDER: The photon states $|R\rangle$ and $|L\rangle$ each carry a unit $\hbar$ of
angular momentum parallel and antiparallel, respectively, to the direction
of propagation of the photon.

7.5.7. Time Evolution 

The matrix representation of the Hamiltonian for a photon propagating along
the optic axis (taken to be the z-axis) of a quartz crystal using the linear
polarization states $|x\rangle$ and $|y\rangle$ as a basis is given by

$$\hat{H} = \begin{pmatrix} 0 & -iE_0 \\ iE_0 & 0 \end{pmatrix}$$

(a) What are the eigenstates and eigenvalues of the Hamiltonian?

(b) A photon enters the crystal linearly polarized in the x direction, that is,
$|\psi(0)\rangle = |x\rangle$. What is $|\psi(t)\rangle$, the state of the photon at time t? Express
your answer in the $\{|x\rangle, |y\rangle\}$ basis.

(c) What is happening to the polarization of the photon as it travels through 
the crystal? 
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7.5.8. K-Meson oscillations 


An additional effect to worry about when thinking about the time development 
of K-meson states is that the $|K_L\rangle$ and $|K_S\rangle$ states decay with time. Thus, we 
expect that these states should have the time dependence 

$$|K_L(t)\rangle = e^{-i\omega_L t - t/2\tau_L}\,|K_L\rangle \quad , \quad |K_S(t)\rangle = e^{-i\omega_S t - t/2\tau_S}\,|K_S\rangle$$

where 

$$\omega_L = E_L/\hbar \quad , \quad E_L = (p^2c^2 + m_L^2c^4)^{1/2}$$
$$\omega_S = E_S/\hbar \quad , \quad E_S = (p^2c^2 + m_S^2c^4)^{1/2}$$

and 

$$\tau_S \approx 0.9 \times 10^{-10}\ \text{sec} \quad , \quad \tau_L \approx 560 \times 10^{-10}\ \text{sec}$$

Suppose that a pure $K_L$ beam is sent through a thin absorber whose only effect 
is to change the relative phase of the $K^0$ and $\bar{K}^0$ amplitudes by 10°. Calculate 
the number of $K_S$ decays, relative to the incident number of particles, that will 
be observed in the first 5 cm after the absorber. Assume the particles have 
momentum = mc. 


7.5.9. What comes out? 

A beam of spin 1/2 particles is sent through a series of three Stern-Gerlach 
measuring devices as shown in Figure 7.11 below: 

Figure 7.11: Stern-Gerlach Setup 

The first SGz device transmits particles with $S_z = \hbar/2$ and filters out 
particles with $S_z = -\hbar/2$. The second device, an SGn device, transmits 
particles with $S_n = \hbar/2$ and filters out particles with $S_n = -\hbar/2$, where the 
axis n makes an angle $\theta$ in the x-z plane with respect to the z-axis. Thus the 
particles passing through this SGn device are in the state 

$$|+n\rangle = \cos\frac{\theta}{2}\,|+z\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|-z\rangle$$

with the angle $\varphi = 0$. A last SGz device transmits particles with $S_z = -\hbar/2$ and 
filters out particles with $S_z = +\hbar/2$. 

(a) What fraction of the particles transmitted through the first SGz device 
will survive the third measurement? 

(b) How must the angle $\theta$ of the SGn device be oriented so as to maximize the 
number of particles that are transmitted by the final SGz device? What 
fraction of the particles survive the third measurement for this value of $\theta$? 

(c) What fraction of the particles survive the last measurement if the SGz 
device is simply removed from the experiment? 

7.5.10. Orientations 

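The kets $|h\rangle$ and $|v\rangle$ are states of horizontal and vertical polarization, 
respectively. Consider the states 

$$|\psi_1\rangle = \left(|h\rangle + \sqrt{3}\,|v\rangle\right) \quad , \quad |\psi_2\rangle = \left(|h\rangle - \sqrt{3}\,|v\rangle\right) \quad , \quad |\psi_3\rangle = |h\rangle$$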

What are the relative orientations of the plane polarization for these three 
states? 

7.5.11. Find the phase angle 

If CP is not conserved in the decay of neutral K mesons, then the states of 
definite energy are no longer the $K_L$, $K_S$ states, but are slightly different states 
$|K'_L\rangle$ and $|K'_S\rangle$. One can write, for example, 

$$|K'_L\rangle = (1 + \varepsilon)\,|K^0\rangle - (1 - \varepsilon)\,|\bar{K}^0\rangle$$

where $\varepsilon$ is a very small complex number ($|\varepsilon| \sim 2 \times 10^{-3}$) that is a 
measure of the lack of CP conservation in the decays. The amplitude for a particle 
to be in $|K'_L\rangle$ (or $|K'_S\rangle$) varies as $e^{-i\omega_L t - t/2\tau_L}$ (or $e^{-i\omega_S t - t/2\tau_S}$) where 

$$\hbar\omega_L = (p^2c^2 + m_L^2c^4)^{1/2} \quad \left(\text{or } \hbar\omega_S = (p^2c^2 + m_S^2c^4)^{1/2}\right)$$

and $\tau_L \gg \tau_S$. 

(a) Write out normalized expressions for the states $|K'_L\rangle$ and $|K'_S\rangle$ in terms of 
$|K^0\rangle$ and $|\bar{K}^0\rangle$. 

(b) Calculate the ratio of the amplitude for a long-lived K to decay to two 
pions (a CP = +1 state) to the amplitude for a short-lived K to decay to 
two pions. What does a measurement of the ratio of these decay rates tell 
us about $\varepsilon$? 

(c) Suppose that a beam of purely long-lived K mesons is sent through an 
absorber whose only effect is to change the relative phase of the $K^0$ and 
$\bar{K}^0$ components by $\delta$. Derive an expression for the number of two pion 
events observed as a function of time of travel from the absorber. How well 
would such a measurement (given $\delta$) enable one to determine the phase of 
$\varepsilon$? 

7.5.12. Quarter-wave plate 

A beam of linearly polarized light is incident on a quarter-wave plate (changes 
relative phase by 90°) with its direction of polarization oriented at 30° to the 
optic axis. Subsequently, the beam is absorbed by a black disk. Determine the 
rate at which angular momentum is transferred to the disk, assuming the beam 
carries N photons per second. 


7.5.13. What is happening? 

A system of N ideal linear polarizers is arranged in sequence. The transmission 
axis of the first polarizer makes an angle $\varphi/N$ with the y-axis. The transmission 
axis of every other polarizer makes an angle $\varphi/N$ with respect to the axis of the 
preceding polarizer. Thus, the transmission axis of the final polarizer makes an 
angle $\varphi$ with the y-axis. A beam of y-polarized photons is incident on the first 
polarizer. 

(a) What is the probability that an incident photon is transmitted by the 
array? 

(b) Evaluate the probability of transmission in the limit of large N. 

(c) Consider the special case with the angle $\varphi = 90°$. Explain why your result is 
not in conflict with the fact that $\langle x | y \rangle = 0$. 

7.5.14. Interference 

Photons freely propagating through a vacuum have one value for their energy 
$E = h\nu$. This is therefore a 1-dimensional quantum mechanical system, and 
since the energy of a freely propagating photon does not change, it must be an 
eigenstate of the energy operator. So, if the state of the photon at t = 0 is denoted 
as $|\psi(0)\rangle$, then the eigenstate equation can be written $\hat{H}|\psi(0)\rangle = E\,|\psi(0)\rangle$. To 
see what happens to the state of the photon with time, we simply have to apply 
the time evolution operator 

$$\begin{aligned}|\psi(t)\rangle &= \hat{U}(t)\,|\psi(0)\rangle = e^{-i\hat{H}t/\hbar}\,|\psi(0)\rangle = e^{-ih\nu t/\hbar}\,|\psi(0)\rangle \\ &= e^{-i2\pi\nu t}\,|\psi(0)\rangle = e^{-i2\pi x/\lambda}\,|\psi(0)\rangle\end{aligned}$$

where the last expression uses the fact that $\nu = c/\lambda$ and that the distance it 
travels is $x = ct$. Notice that the relative probability of finding the photon at 
various points along the x-axis (the absolute probability depends on the number 
of photons emerging per unit time) does not change since the modulus-square of 
the factor in front of $|\psi(0)\rangle$ is 1. Consider the following situation. Two sources 
of identical photons face each other and emit photons at the same time. Let the 
distance between the two sources be L. 

Notice that we are assuming the photons emerge from each source in state 
$|\psi(0)\rangle$. In between the two light sources we can detect photons but we do 
not know from which source they originated. Therefore, we have to treat the 
photons at a point along the x-axis as a superposition of the time-evolved state 
from the left source and the time-evolved state from the right source. 

Figure 7.12: Interference Setup 

(a) What is this superposition state $|\psi(t)\rangle$ at a point x between the sources? 
Assume the photons have wavelength $\lambda$. 

(b) Find the relative probability of detecting a photon at point x by evaluating 
$|\langle\psi(t)\,|\,\psi(t)\rangle|^2$ at the point x. 

(c) Describe in words what your result is telling you. Does this correspond to 
anything you have seen when light is described as a wave? 

7.5.15. More Interference 

Now let us tackle the two slit experiment with photons being shot at the slits one 
at a time. The situation looks something like the figure below. The distance 
between the slits, d, is quite small (less than a mm) and the distance up the 
y-axis (screen) where the photons arrive is much, much less than L (the distance 
between the slits and the screen). In the figure, $S_1$ and $S_2$ are the lengths of the 
photon paths from the two slits to a point a distance y up the y-axis from the 
midpoint of the slits. The most important quantity is the difference in length 
between the two paths. The path length difference or PLD is shown in the 
figure. 




Figure 7.13: Double-Slit Interference Setup 


We calculate PLD as follows: 

$$\text{PLD} = d\sin\theta = d\left[\frac{y}{[L^2 + y^2]^{1/2}}\right] \approx \frac{yd}{L} \quad , \quad y \ll L$$

Show that the relative probability of detecting a photon at various points along 
the screen is approximately equal to 



7.5.16. The Mach-Zehnder Interferometer and Quantum Interference 

Background information: Consider a single photon incident on a 50-50 beam 
splitter (that is, a partially transmitting, partially reflecting mirror, with equal 
coefficients). Whereas classical electromagnetic energy divides equally, the 
photon is indivisible. That is, if a photon-counting detector is placed at each of the 
output ports (see figure below), only one of them clicks. Which one clicks is 
completely random (that is, we have no better guess for one over the other). 



Figure 7.14: Beam Splitter 


The input-output transformations of the waves incident on 50-50 beam splitters 
and perfectly reflecting mirrors are shown in the figure below. 





Figure 7.15: Input-Output transformation 


(a) Show that with these rules, there is a 50-50 chance that either of the detectors 
shown in the first figure above clicks. 

(b) Now we set up a Mach-Zehnder interferometer (shown below): 

Figure 7.16: Input-Output transformation 

The wave is split at beam splitter b1, where it travels either the path 
b1-m1-b2 (call it the green path) or the path b1-m2-b2 (call it the blue path). 
Mirrors are then used to recombine the beams on a second beam splitter, 
b2. Detectors D1 and D2 are placed at the two output ports of b2. 

Assuming the paths are perfectly balanced (that is, of equal length), show 
that the probability for detector D1 to click is 100% - no randomness! 

(c) Classical logical reasoning would predict a probability for D1 to click given by 

$$P_{D1} = P(\text{transmission at b2}\,|\,\text{green path})\,P(\text{green path}) + P(\text{reflection at b2}\,|\,\text{blue path})\,P(\text{blue path})$$

Calculate this and compare to the quantum result. Explain. 

(d) How would you set up the interferometer so that detector D2 clicked with 
100% probability? How about making them click at random? Leave the 
basic geometry the same, that is, do not change the direction of the beam 
splitters or the direction of the incident light. 

7.5.17. More Mach-Zehnder 

An experimenter sets up two optical devices for single photons. The first, (i) 
in the figure below, is a standard balanced Mach-Zehnder interferometer with equal 
path lengths, perfectly reflecting mirrors (M) and 50-50 beam splitters (BS). 

Figure 7.17: Mach-Zehnder Setups 

A transparent piece of glass which imparts a phase shift (PS) $\phi$ is placed in one 
arm. Photons are detected (D) at one port. The second interferometer, (ii) in 
the figure, is the same except that the final beam splitter is omitted. 

Sketch the probability of detecting the photon as a function of $\phi$ for each device. 
Explain your answer. 

Chapter 8 


Schrodinger Wave equation 
1-Dimensional Quantum Systems 


8.1. The Coordinate Representation 

To form a representation of an abstract linear vector space we must carry out 
these steps: 

1. Choose a complete, orthonormal set of basis vectors $\{|\alpha_k\rangle\}$. 

2. Construct the identity operator $\hat{I}$ as a sum over the one-dimensional 
subspace projection operators $|\alpha_k\rangle\langle\alpha_k|$ 

$$\hat{I} = \sum_k |\alpha_k\rangle\langle\alpha_k| \tag{8.1}$$

3. Write an arbitrary vector $|\psi\rangle$ as a linear combination or superposition of 
basis vectors using the identity operator 

$$|\psi\rangle = \hat{I}\,|\psi\rangle = \left(\sum_k |\alpha_k\rangle\langle\alpha_k|\right)|\psi\rangle = \sum_k \langle\alpha_k|\psi\rangle\,|\alpha_k\rangle \tag{8.2}$$

It is clear from this last equation that knowledge about the behavior (say 
in time) of the expansion coefficients $\langle\alpha_k|\psi\rangle$ will tell us the behavior of 
the state vector $|\psi\rangle$ and allow us to make predictions. Remember also 
that the expansion coefficient is the probability amplitude for a particle 
in the state $|\psi\rangle$ to behave like it is in the state $|\alpha_k\rangle$. 
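
The three steps above can be sketched numerically for a finite-dimensional case. The following is a minimal illustration, assuming a hypothetical two-dimensional space; the particular basis and state vector are chosen only for the example and do not come from the text.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

# Step 1: a hypothetical orthonormal basis {|alpha_1>, |alpha_2>}.
s = 1 / math.sqrt(2)
basis = [
    [complex(s), complex(0, s)],    # |alpha_1> = (|0> + i|1>)/sqrt(2)
    [complex(s), complex(0, -s)],   # |alpha_2> = (|0> - i|1>)/sqrt(2)
]

# An arbitrary normalized state |psi> (illustrative choice).
psi = [complex(0.6), complex(0, 0.8)]

# Step 3: expansion coefficients <alpha_k|psi> (probability amplitudes).
coeffs = [inner(a, psi) for a in basis]

# Step 2 in action: sum_k |alpha_k><alpha_k|psi> must reproduce |psi>.
rebuilt = [sum(c * a[i] for c, a in zip(coeffs, basis)) for i in range(2)]

# Probabilities |<alpha_k|psi>|^2 for behaving like each basis state.
probs = [abs(c) ** 2 for c in coeffs]
```

Here `rebuilt` agrees with `psi` to rounding error and the entries of `probs` sum to one, which is the finite-dimensional content of (8.1) and (8.2).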

A particular representation that has become very important in the study of 
many systems using Quantum Mechanics is formed using the eigenstates of the 
position operator as a basis. It is called the coordinate or position representation. 

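The eigenstates $\{|\vec{x}\rangle\}$ of the position operator $(\hat{x}, \hat{y}, \hat{z}) = \hat{\vec{Q}}$ satisfy 

$$\hat{x}\,|\vec{x}\rangle = x\,|\vec{x}\rangle \quad , \quad \hat{y}\,|\vec{x}\rangle = y\,|\vec{x}\rangle \quad , \quad \hat{z}\,|\vec{x}\rangle = z\,|\vec{x}\rangle \tag{8.3}$$

where the eigenvalues (x, y, z) are continuous variables in the range $[-\infty, \infty]$. 
They form the basis of the coordinate representation. 

As we saw earlier, for a continuous spectrum all of the summations above become 
integrals, and we have 

$$\hat{I} = \int |\vec{x}\rangle\langle\vec{x}|\,d\vec{x} \quad , \quad \text{where } d\vec{x} = dx\,dy\,dz \tag{8.4}$$

$$|\psi\rangle = \hat{I}\,|\psi\rangle = \int \left(|\vec{x}\rangle\langle\vec{x}|\right)|\psi\rangle\,d\vec{x} = \int \langle\vec{x}|\psi\rangle\,|\vec{x}\rangle\,d\vec{x} \tag{8.5}$$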

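The expansion coefficient in the coordinate representation is given by 

$$\psi(\vec{x}) = \langle\vec{x}|\psi\rangle \tag{8.6}$$

Since the inner product is defined for all states $|\vec{x}\rangle$, this new object is clearly 
a function of the eigenvalues (x, y, z). As we will see, it is the probability 
amplitude for finding the particle to be in the neighborhood of the point $\vec{x}$ in 
3-dimensional space if it is in the state $|\psi\rangle$. It is called the wave function. 

The bra vector or linear functional corresponding to $|\psi\rangle$ is 

$$\langle\psi| = \langle\psi|\,\hat{I} = \int \langle\psi|\left(|\vec{x}\rangle\langle\vec{x}|\right)d\vec{x} = \int \langle\psi|\vec{x}\rangle\,\langle\vec{x}|\,d\vec{x} = \int \langle\vec{x}|\psi\rangle^*\,\langle\vec{x}|\,d\vec{x} \tag{8.7}$$

The normalization condition takes the form 

$$\begin{aligned}\langle\psi|\psi\rangle = 1 &= \langle\psi|\,\hat{I}\,|\psi\rangle = \int \langle\psi|\vec{x}\rangle\langle\vec{x}|\psi\rangle\,d\vec{x} \\ &= \int |\langle\vec{x}|\psi\rangle|^2\,d\vec{x} = \int |\psi(\vec{x})|^2\,d\vec{x} \\ &= \int \psi^*(\vec{x})\,\psi(\vec{x})\,d\vec{x}\end{aligned} \tag{8.8}$$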

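The probability amplitude for a particle in the state $|\psi\rangle$ to behave like it is in 
the state $|\phi\rangle$, where 

$$|\phi\rangle = \hat{I}\,|\phi\rangle = \int \left(|\vec{x}\rangle\langle\vec{x}|\right)|\phi\rangle\,d\vec{x} = \int \langle\vec{x}|\phi\rangle\,|\vec{x}\rangle\,d\vec{x} \tag{8.9}$$

is given by 

$$\begin{aligned}\langle\phi|\psi\rangle &= \left(\int \langle\vec{x}|\phi\rangle^*\,\langle\vec{x}|\,d\vec{x}\right)\left(\int \langle\vec{x}'|\psi\rangle\,|\vec{x}'\rangle\,d\vec{x}'\right) \\ &= \int d\vec{x}\int d\vec{x}'\,\langle\vec{x}|\phi\rangle^*\,\langle\vec{x}'|\psi\rangle\,\langle\vec{x}|\vec{x}'\rangle\end{aligned} \tag{8.10}$$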

Now, in our earlier discussions, we assumed the normalization condition 

$$\langle\vec{x}|\vec{x}'\rangle = \delta(\vec{x} - \vec{x}') \tag{8.11}$$

(position eigenvectors are not normalizable). This normalization condition 
actually follows (we did not need to assume it) from the expansion in basis states. 
We have 

$$\begin{aligned}|\psi\rangle &= \int \langle\vec{x}'|\psi\rangle\,|\vec{x}'\rangle\,d\vec{x}' \\ \langle\vec{x}|\psi\rangle &= \int \langle\vec{x}'|\psi\rangle\,\langle\vec{x}|\vec{x}'\rangle\,d\vec{x}' \\ \psi(\vec{x}) &= \int \psi(\vec{x}')\,\langle\vec{x}|\vec{x}'\rangle\,d\vec{x}'\end{aligned}$$

which implies the delta-function normalization. Thus, the delta function 
normalization follows from the completeness property of the projection operators 
or vice versa. 

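Using this result in (8.10) we get 

$$\begin{aligned}\langle\phi|\psi\rangle &= \int d\vec{x}\int d\vec{x}'\,\langle\vec{x}|\phi\rangle^*\,\langle\vec{x}'|\psi\rangle\,\delta(\vec{x} - \vec{x}') \\ &= \int \langle\vec{x}|\phi\rangle^*\,\langle\vec{x}|\psi\rangle\,d\vec{x} = \int \phi^*(\vec{x})\,\psi(\vec{x})\,d\vec{x}\end{aligned} \tag{8.12}$$

We formally write the $\hat{\vec{Q}}$ operator using the expansion in eigenvalues and 
projection operators as 

$$\hat{\vec{Q}} = \int \vec{x}\,|\vec{x}\rangle\langle\vec{x}|\,d\vec{x} \tag{8.13}$$

This represents three integrals, one for each coordinate. 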
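
As a rough numerical illustration of the normalization integral (8.8) and the overlap integral (8.12), the sketch below works in one dimension with two Gaussian wave functions. The Gaussian form, the widths, the centers and the helper names `integrate` and `gaussian` are all assumptions made for the example, not choices taken from the text.

```python
import math

def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule on [a, b]; works for real or complex f."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def gaussian(x0, sigma=1.0):
    """Normalized 1D Gaussian wave function centred at x0 (illustrative)."""
    norm = (2 * math.pi * sigma**2) ** -0.25
    return lambda x: complex(norm * math.exp(-(x - x0) ** 2 / (4 * sigma**2)))

psi = gaussian(0.0)   # <x|psi>
phi = gaussian(1.0)   # <x|phi>

# Normalization: integral of |psi(x)|^2 dx should be 1.
norm_psi = integrate(lambda x: abs(psi(x)) ** 2, -12, 12)

# Amplitude <phi|psi> = integral of phi*(x) psi(x) dx.
overlap = integrate(lambda x: phi(x).conjugate() * psi(x), -12, 12)
```

For these two unit-width Gaussians separated by one unit the overlap evaluates analytically to $e^{-1/8}$, which the numerical integral reproduces closely.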

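We will also need the properties of the linear momentum operator. The 
eigenstates $\{|\vec{p}\rangle\}$ of the momentum operator $(\hat{p}_x, \hat{p}_y, \hat{p}_z) = \hat{\vec{P}}$ satisfy 

$$\hat{p}_x\,|\vec{p}\rangle = p_x\,|\vec{p}\rangle \quad , \quad \hat{p}_y\,|\vec{p}\rangle = p_y\,|\vec{p}\rangle \quad , \quad \hat{p}_z\,|\vec{p}\rangle = p_z\,|\vec{p}\rangle \tag{8.14}$$

where the eigenvalues $(p_x, p_y, p_z)$ are continuous variables in the range $[-\infty, \infty]$. 
They form the basis of the momentum representation. 

We then have 

$$\hat{I} = \frac{1}{(2\pi\hbar)^3}\int |\vec{p}\rangle\langle\vec{p}|\,d\vec{p} \tag{8.15}$$

$$|\psi\rangle = \hat{I}\,|\psi\rangle = \frac{1}{(2\pi\hbar)^3}\int \left(|\vec{p}\rangle\langle\vec{p}|\right)|\psi\rangle\,d\vec{p} = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\psi\rangle\,|\vec{p}\rangle\,d\vec{p} \tag{8.16}$$

The expansion coefficient in the momentum representation is 

$$\Psi(\vec{p}) = \langle\vec{p}|\psi\rangle \tag{8.17}$$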

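It is the probability amplitude for finding the particle with momentum $\vec{p}$ (in the 
neighborhood of) if it is in the state $|\psi\rangle$. 

The bra vector or linear functional corresponding to $|\psi\rangle$ is 

$$\langle\psi| = \langle\psi|\,\hat{I} = \frac{1}{(2\pi\hbar)^3}\int \langle\psi|\left(|\vec{p}\rangle\langle\vec{p}|\right)d\vec{p} = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\psi\rangle^*\,\langle\vec{p}|\,d\vec{p} \tag{8.18}$$

The normalization condition takes the form 

$$\begin{aligned}\langle\psi|\psi\rangle = 1 &= \langle\psi|\,\hat{I}\,|\psi\rangle = \frac{1}{(2\pi\hbar)^3}\int \langle\psi|\vec{p}\rangle\langle\vec{p}|\psi\rangle\,d\vec{p} \\ &= \frac{1}{(2\pi\hbar)^3}\int |\langle\vec{p}|\psi\rangle|^2\,d\vec{p} = \frac{1}{(2\pi\hbar)^3}\int |\Psi(\vec{p})|^2\,d\vec{p} \\ &= \frac{1}{(2\pi\hbar)^3}\int \Psi^*(\vec{p})\,\Psi(\vec{p})\,d\vec{p}\end{aligned} \tag{8.19}$$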

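The probability amplitude for a particle in the state $|\psi\rangle$ to behave like it is in 
the state $|\phi\rangle$, where 

$$|\phi\rangle = \hat{I}\,|\phi\rangle = \frac{1}{(2\pi\hbar)^3}\int \left(|\vec{p}\rangle\langle\vec{p}|\right)|\phi\rangle\,d\vec{p} = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\phi\rangle\,|\vec{p}\rangle\,d\vec{p} \tag{8.20}$$

is given by 

$$\begin{aligned}\langle\phi|\psi\rangle &= \left(\frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\phi\rangle^*\,\langle\vec{p}|\,d\vec{p}\right)\left(\frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}'|\psi\rangle\,|\vec{p}'\rangle\,d\vec{p}'\right) \\ &= \frac{1}{(2\pi\hbar)^6}\int d\vec{p}\int d\vec{p}'\,\langle\vec{p}|\phi\rangle^*\,\langle\vec{p}'|\psi\rangle\,\langle\vec{p}|\vec{p}'\rangle\end{aligned} \tag{8.21}$$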

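The normalization condition follows from 

$$\begin{aligned}|\psi\rangle &= \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}'|\psi\rangle\,|\vec{p}'\rangle\,d\vec{p}' \\ \langle\vec{p}|\psi\rangle &= \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}'|\psi\rangle\,\langle\vec{p}|\vec{p}'\rangle\,d\vec{p}' \\ \Psi(\vec{p}) &= \frac{1}{(2\pi\hbar)^3}\int \Psi(\vec{p}')\,\langle\vec{p}|\vec{p}'\rangle\,d\vec{p}'\end{aligned}$$

which implies that 

$$\frac{1}{(2\pi\hbar)^3}\,\langle\vec{p}|\vec{p}'\rangle = \delta(\vec{p} - \vec{p}') \tag{8.22}$$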

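Using this result we get 

$$\begin{aligned}\langle\phi|\psi\rangle &= \frac{1}{(2\pi\hbar)^6}\int d\vec{p}\int d\vec{p}'\,\langle\vec{p}|\phi\rangle^*\,\langle\vec{p}'|\psi\rangle\,\langle\vec{p}|\vec{p}'\rangle \\ &= \frac{1}{(2\pi\hbar)^3}\int d\vec{p}\int d\vec{p}'\,\langle\vec{p}|\phi\rangle^*\,\langle\vec{p}'|\psi\rangle\,\delta(\vec{p} - \vec{p}') \\ &= \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\phi\rangle^*\,\langle\vec{p}|\psi\rangle\,d\vec{p} = \frac{1}{(2\pi\hbar)^3}\int \Phi^*(\vec{p})\,\Psi(\vec{p})\,d\vec{p}\end{aligned} \tag{8.23}$$

We formally write the $\hat{\vec{P}}$ operator using the expansion in eigenvalues and 
projection operators (with the same measure as in the identity operator (8.15)) as 

$$\hat{\vec{P}} = \frac{1}{(2\pi\hbar)^3}\int \vec{p}\,|\vec{p}\rangle\langle\vec{p}|\,d\vec{p} \tag{8.24}$$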

We will now derive the connections between the two representations. 

We showed earlier (4.413) that 

$$\langle\vec{x}|\vec{p}\rangle = e^{i\vec{p}\cdot\vec{x}/\hbar} \tag{8.25}$$

This is, in fact, a key result. It will enable us to derive the famous Schrodinger 
equation. 

Before doing the derivation let us see how this result fits into our formalism and 
further our understanding of its meaning by deriving it in a different way. 

This way uses the Fourier transform ideas we derived earlier. 

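First, let us review what we said earlier. We have 

$$\psi(\vec{x}) = \langle\vec{x}|\psi\rangle = \langle\vec{x}|\,\hat{I}\,|\psi\rangle = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{x}|\vec{p}\rangle\,\langle\vec{p}|\psi\rangle\,d\vec{p} = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{x}|\vec{p}\rangle\,\Psi(\vec{p})\,d\vec{p} \tag{8.26}$$

and 

$$\begin{aligned}\Psi(\vec{p}) &= \langle\vec{p}|\psi\rangle = \langle\vec{p}|\,\hat{I}\,|\psi\rangle = \int \langle\vec{p}|\vec{x}\rangle\,\langle\vec{x}|\psi\rangle\,d\vec{x} \\ &= \int \langle\vec{x}|\vec{p}\rangle^*\,\psi(\vec{x})\,d\vec{x}\end{aligned} \tag{8.27}$$

Fourier transforms are written 

$$g(\vec{p}) = \int e^{-i\vec{p}\cdot\vec{x}/\hbar}\,f(\vec{x})\,d\vec{x} \quad , \quad f(\vec{x}) = \frac{1}{(2\pi\hbar)^3}\int e^{i\vec{p}\cdot\vec{x}/\hbar}\,g(\vec{p})\,d\vec{p} \tag{8.28}$$

which agrees with the result $\langle\vec{x}|\vec{p}\rangle = e^{i\vec{p}\cdot\vec{x}/\hbar}$. 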
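
The transform pair can be checked numerically in one dimension, where the prefactor $1/(2\pi\hbar)^3$ becomes $1/(2\pi\hbar)$. The sketch below assumes units with $\hbar = 1$ and an unnormalized Gaussian $\psi(x) = e^{-x^2/2}$; both choices, and the helper names, are illustrative assumptions rather than anything fixed by the text.

```python
import cmath
import math

HBAR = 1.0  # assumed units

def integrate(f, a, b, n=400):
    """Composite trapezoidal rule on [a, b] for a complex-valued f."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def psi(x):
    """Illustrative position-space wave function (unnormalized Gaussian)."""
    return math.exp(-x * x / 2)

def Psi(p):
    """Momentum amplitude: integral of <x|p>* psi(x) dx with <x|p> = exp(ipx/hbar)."""
    return integrate(lambda x: cmath.exp(-1j * p * x / HBAR) * psi(x), -10, 10)

def psi_back(x):
    """Inverse transform: (1/(2 pi hbar)) * integral of <x|p> Psi(p) dp."""
    total = integrate(lambda p: cmath.exp(1j * p * x / HBAR) * Psi(p), -10, 10)
    return total / (2 * math.pi * HBAR)
```

For this Gaussian the forward transform is known in closed form, $\Psi(p) = \sqrt{2\pi}\,e^{-p^2/2}$, and `psi_back(x)` returns the original `psi(x)` to within the accuracy of the quadrature.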

It is not a unique choice, however. It was not unique in our earlier derivation 
either. It is the choice, however, that allows Quantum mechanics to make 
predictions that agree with experiment. It is Nature’s choice! 

We might even say that this choice is another postulate. 

Now, we can use these results to determine the expectation values of operators 
involving the position and momentum operators. 

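Since we are interested in the coordinate representation we need only determine 
the following quantities. 

The position operator calculations are straightforward 

$$\langle\vec{x}|\,\hat{\vec{Q}}\,|\psi\rangle = \vec{x}\,\langle\vec{x}|\psi\rangle \quad \text{and} \quad \langle\vec{x}|\,f(\hat{\vec{Q}})\,|\psi\rangle = f(\vec{x})\,\langle\vec{x}|\psi\rangle \tag{8.29}$$

For the momentum operator we write 

$$\begin{aligned}\langle\vec{x}|\,\hat{\vec{P}}\,|\psi\rangle &= \langle\vec{x}|\,\hat{\vec{P}}\,\hat{I}\,|\psi\rangle = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{x}|\,\hat{\vec{P}}\,|\vec{p}\rangle\,\langle\vec{p}|\psi\rangle\,d\vec{p} \\ &= \frac{1}{(2\pi\hbar)^3}\int \vec{p}\,\langle\vec{x}|\vec{p}\rangle\,\langle\vec{p}|\psi\rangle\,d\vec{p}\end{aligned}$$

Now using $\langle\vec{x}|\vec{p}\rangle = e^{i\vec{p}\cdot\vec{x}/\hbar}$ we have 

$$\vec{p}\,\langle\vec{x}|\vec{p}\rangle = -i\hbar\nabla\langle\vec{x}|\vec{p}\rangle = \langle\vec{x}|\,\hat{\vec{P}}\,|\vec{p}\rangle \tag{8.30}$$

and thus 

$$\begin{aligned}\langle\vec{x}|\,\hat{\vec{P}}\,|\psi\rangle &= \frac{1}{(2\pi\hbar)^3}\int \vec{p}\,\langle\vec{x}|\vec{p}\rangle\,\langle\vec{p}|\psi\rangle\,d\vec{p} \\ &= \frac{1}{(2\pi\hbar)^3}\int \left(-i\hbar\nabla\langle\vec{x}|\vec{p}\rangle\right)\langle\vec{p}|\psi\rangle\,d\vec{p} \\ &= -i\hbar\nabla\left[\frac{1}{(2\pi\hbar)^3}\int \langle\vec{x}|\vec{p}\rangle\,\langle\vec{p}|\psi\rangle\,d\vec{p}\right] \\ &= -i\hbar\nabla\langle\vec{x}|\psi\rangle\end{aligned} \tag{8.31}$$

In a similar manner, we can also show that 

$$\langle\vec{x}|\,\hat{\vec{P}}^2\,|\psi\rangle = (-i\hbar\nabla)^2\langle\vec{x}|\psi\rangle = -\hbar^2\nabla^2\langle\vec{x}|\psi\rangle \tag{8.32}$$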
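
The key step (8.30) can be checked numerically in one dimension: applying $-i\hbar\,d/dx$ to the plane wave $e^{ipx/\hbar}$ should return $p$ times the same plane wave. The sketch below approximates the derivative with a central difference; the units ($\hbar = 1$), the momentum value and the step size are illustrative assumptions.

```python
import cmath

HBAR = 1.0  # assumed units

def plane_wave(p):
    """<x|p> = exp(i p x / hbar) in one dimension."""
    return lambda x: cmath.exp(1j * p * x / HBAR)

def momentum_op(f, x, h=1e-5):
    """Apply -i*hbar*d/dx to f at the point x using a central difference."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

p = 2.5                  # illustrative momentum eigenvalue
f = plane_wave(p)
x = 0.4                  # illustrative point
ratio = momentum_op(f, x) / f(x)   # close to p if (8.30) holds
```

The ratio comes out equal to `p` up to the finite-difference error, with a negligible imaginary part.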

Alternatively, we can show that the gradient is the correct result using the 
symmetry transformation ideas we developed earlier in Chapter 6. 

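The momentum operator is the generator of displacements in space. We showed 
earlier that 

$$e^{-i\vec{a}\cdot\hat{\vec{P}}/\hbar}\,|\vec{x}\rangle = |\vec{x} + \vec{a}\rangle \tag{8.33}$$

Therefore, taking the adjoint and using the gradient representation, we have 

$$\begin{aligned}\psi(\vec{x} + \vec{a}) &= \langle\vec{x} + \vec{a}|\psi\rangle = \langle\vec{x}|\,e^{i\vec{a}\cdot\hat{\vec{P}}/\hbar}\,|\psi\rangle \\ &= e^{\vec{a}\cdot\nabla}\,\langle\vec{x}|\psi\rangle = e^{\vec{a}\cdot\nabla}\,\psi(\vec{x})\end{aligned} \tag{8.34}$$

Since 

$$e^{\vec{a}\cdot\nabla} = 1 + \vec{a}\cdot\nabla + \frac{1}{2}(\vec{a}\cdot\nabla)^2 + \cdots \tag{8.35}$$

we have the standard Taylor series for $\psi(\vec{x} + \vec{a})$. Therefore, the gradient 
representation of the momentum operator makes sense. 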
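
The statement that the exponential of the gradient generates a displacement can be illustrated in one dimension by summing the Taylor series term by term. The sketch below uses $\sin x$ as a stand-in for $\psi(x)$, since its derivatives are known in closed form; that choice, the truncation at 30 terms, and the numerical values are assumptions for illustration only.

```python
import math

def nth_derivative_of_sin(n, x):
    """n-th derivative of sin at x; the derivatives cycle with period 4."""
    cycle = [
        math.sin,
        math.cos,
        lambda t: -math.sin(t),
        lambda t: -math.cos(t),
    ]
    return cycle[n % 4](x)

def translate(x, a, terms=30):
    """Apply exp(a d/dx) to sin at x via the truncated power series."""
    return sum(
        a**n / math.factorial(n) * nth_derivative_of_sin(n, x)
        for n in range(terms)
    )

x, a = 0.3, 1.2
shifted = translate(x, a)   # should agree with sin(x + a)
```

The series sum reproduces `math.sin(x + a)` to rounding error, which is the one-dimensional content of the Taylor-series argument.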


We now use these results to derive the Schrodinger wave equation. 

The Schrodinger wave equation is the partial differential equation that 
corresponds to the eigenvector/eigenvalue equation for the Hamiltonian operator or 
the energy operator. 

The resulting states are the energy eigenstates. We already saw that energy 
eigenstates are stationary states and thus have simple time dependence. This 
property will allow us to find the time dependence of amplitudes for very 
complex systems in a straightforward way. 


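We have 

$$\hat{H}\,|\psi_E\rangle = E\,|\psi_E\rangle \tag{8.36}$$

where E = a number and $\hat{H}$ = the energy operator = (kinetic energy + potential 
energy) operators, i.e., 

$$\hat{H} = \frac{\hat{\vec{P}}^2}{2m} + V(\hat{\vec{Q}}) \tag{8.37}$$

We then have 

$$\begin{aligned}\langle\vec{x}|\,\hat{H}\,|\psi_E\rangle &= E\,\langle\vec{x}|\psi_E\rangle \\ \langle\vec{x}|\,\frac{\hat{\vec{P}}^2}{2m}\,|\psi_E\rangle + \langle\vec{x}|\,V(\hat{\vec{Q}})\,|\psi_E\rangle &= E\,\langle\vec{x}|\psi_E\rangle \\ -\frac{\hbar^2}{2m}\nabla^2\langle\vec{x}|\psi_E\rangle + V(\vec{x})\,\langle\vec{x}|\psi_E\rangle &= E\,\langle\vec{x}|\psi_E\rangle \\ -\frac{\hbar^2}{2m}\nabla^2\psi_E(\vec{x}) + V(\vec{x})\,\psi_E(\vec{x}) &= E\,\psi_E(\vec{x})\end{aligned} \tag{8.38}$$

which is the time-independent Schrodinger wave equation. The quantity 

$$\psi_E(\vec{x}) = \langle\vec{x}|\psi_E\rangle \tag{8.39}$$

is the wave function or the energy eigenfunction in the position representation 
corresponding to energy E. 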
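
To see equation (8.38) at work on a concrete case, the sketch below checks a known eigenfunction numerically in one dimension. It assumes, purely for illustration, the harmonic-oscillator potential $V(x) = \frac{1}{2}m\omega^2x^2$ with ground state $\psi(x) = e^{-m\omega x^2/2\hbar}$ and $E = \hbar\omega/2$, in units where $\hbar = m = \omega = 1$; the oscillator has not been introduced in this section, and the second derivative is approximated by a finite difference.

```python
import math

HBAR = M = OMEGA = 1.0   # assumed units

def V(x):
    """Illustrative potential: 1D harmonic oscillator."""
    return 0.5 * M * OMEGA**2 * x * x

def psi(x):
    """Unnormalized oscillator ground state (assumed known solution)."""
    return math.exp(-M * OMEGA * x * x / (2 * HBAR))

E = 0.5 * HBAR * OMEGA   # ground-state energy of the assumed example

def hamiltonian_on_psi(x, h=1e-3):
    """Left side of (8.38): -(hbar^2/2m) psi'' + V psi, psi'' by central difference."""
    d2 = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h**2
    return -HBAR**2 / (2 * M) * d2 + V(x) * psi(x)

# Residual of the eigenvalue equation at a few sample points.
residuals = [hamiltonian_on_psi(x) - E * psi(x) for x in (-1.5, 0.0, 0.7, 2.0)]
```

Each residual is zero to within the discretization error, so the trial function satisfies the time-independent equation with the stated eigenvalue.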


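Now the energy eigenfunctions have a simple time dependence, as we can see 
from the following. Since 

$$\hat{U}(t)\,|\psi_E\rangle = e^{-i\hat{H}t/\hbar}\,|\psi_E\rangle = e^{-iEt/\hbar}\,|\psi_E\rangle \tag{8.40}$$

we have 

$$\begin{aligned}\langle\vec{x}|\,\hat{U}(t)\,|\psi_E\rangle &= \psi_E(\vec{x}, t) = e^{-iEt/\hbar}\,\langle\vec{x}|\psi_E\rangle \\ \psi_E(\vec{x}, t) &= e^{-iEt/\hbar}\,\psi_E(\vec{x}, 0)\end{aligned} \tag{8.41}$$

Therefore, 

$$\begin{aligned}-\frac{\hbar^2}{2m}\nabla^2\psi_E(\vec{x}, t) + V(\vec{x})\,\psi_E(\vec{x}, t) &= E\,\psi_E(\vec{x}, t) \\ -\frac{\hbar^2}{2m}\nabla^2\psi_E(\vec{x}, t) + V(\vec{x})\,\psi_E(\vec{x}, t) &= i\hbar\frac{\partial}{\partial t}\psi_E(\vec{x}, t)\end{aligned} \tag{8.42}$$

which is the time-dependent Schrodinger wave equation. 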
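
A quick numerical check of (8.41) and (8.42) is sketched below for the simplest possible case. It assumes a one-dimensional stationary state with $V = 0$, namely $\psi_E(x, t) = e^{i(px - Et)/\hbar}$ with $E = p^2/2m$, in units where $\hbar = m = 1$; the momentum value, sample point and step size are illustrative. Both sides of (8.42) are evaluated with finite differences.

```python
import cmath

HBAR, M = 1.0, 1.0       # assumed units
P = 1.5                  # illustrative momentum
E = P**2 / (2 * M)       # energy of the assumed V = 0 stationary state

def psi(x, t):
    """Stationary state exp(-iEt/hbar) * exp(ipx/hbar)."""
    return cmath.exp(1j * (P * x - E * t) / HBAR)

def lhs(x, t, h=1e-4):
    """-(hbar^2/2m) d^2 psi/dx^2 (V = 0), by central difference."""
    d2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    return -HBAR**2 / (2 * M) * d2

def rhs(x, t, h=1e-4):
    """i*hbar d psi/dt, by central difference."""
    return 1j * HBAR * (psi(x, t + h) - psi(x, t - h)) / (2 * h)

x, t = 0.8, 0.3
left, right, target = lhs(x, t), rhs(x, t), E * psi(x, t)
```

Both `left` and `right` agree with `E * psi(x, t)`, and `abs(psi(x, t))` stays equal to 1 for all times, which is what "stationary" means for the probability density.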

We will have much more to say about these equations and how to use them later 
on. For now, however, let us look at these ideas in a couple of different ways to 
try and get a better understanding of what they mean. 


8.2. The Free Particle and Wave Packets 

Let us assume that the potential energy term in the one-particle Schrodinger 
equation is equal to zero. The solution, in this case, is called a free particle. 

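We can easily see the form of the solution as a function of space and time as 
follows: the Hamiltonian of the system is 

$$\hat{H} = \frac{\hat{\vec{P}}^2}{2m} \quad \text{with} \quad \hat{H}\,|\psi_E\rangle = E\,|\psi_E\rangle \tag{8.43}$$

This means that $[\hat{H}, \hat{\vec{P}}] = 0$ and thus $\hat{H}$ and $\hat{\vec{P}}$ share a common set of 
eigenvectors (eigenfunctions). The eigenvectors (eigenfunctions) corresponding to linear 
momentum $\hat{\vec{P}}$ are easy to find from our earlier derivations. We have 

$$\begin{aligned}\hat{\vec{P}}\,|\vec{p}\rangle &= \hat{\vec{P}}\,|\psi_E\rangle = \vec{p}\,|\vec{p}\rangle = \vec{p}\,|\psi_E\rangle \\ \langle\vec{x}|\,\hat{\vec{P}}\,|\psi_E\rangle &= \vec{p}\,\langle\vec{x}|\psi_E\rangle = \vec{p}\,e^{i\vec{p}\cdot\vec{x}/\hbar} \\ &= -i\hbar\nabla e^{i\vec{p}\cdot\vec{x}/\hbar} = -i\hbar\nabla\langle\vec{x}|\psi_E\rangle\end{aligned} \tag{8.44}$$

which says that the eigenfunctions are 

$$\langle\vec{x}|\vec{p}\rangle = e^{i\vec{p}\cdot\vec{x}/\hbar} = \psi_E(\vec{x}) = \langle\vec{x}|\psi_E\rangle = \langle\vec{x}|E\rangle \tag{8.45}$$

We can see that this satisfies the time-independent Schrodinger equation with 
no potential term. This solution is also called a plane-wave. 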

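If we assume that at time t = 0, the state of the particle is 

$$\psi_E(\vec{x}, 0) = e^{i\vec{p}\cdot\vec{x}/\hbar} \tag{8.46}$$

then the time dependence is given by 

$$\begin{aligned}\langle\vec{x}, t| &= \langle\vec{x}, 0|\,\hat{U}(t) = \langle\vec{x}, 0|\,e^{-i\hat{H}t/\hbar} \\ \langle\vec{x}, t|\psi_E\rangle &= e^{-iEt/\hbar}\,\langle\vec{x}, 0|\psi_E\rangle \\ \psi_E(\vec{x}, t) &= e^{-iEt/\hbar}\,\psi_E(\vec{x}, 0) = e^{-iEt/\hbar}\,e^{i\vec{p}\cdot\vec{x}/\hbar} = e^{i(\vec{p}\cdot\vec{x} - Et)/\hbar}\end{aligned} \tag{8.47}$$

where $E = p^2/2m$. 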


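Now we can write the state vector as a linear combination of momentum 
state vectors or, correspondingly, we can write the position space wave function 
as a linear combination of momentum eigenfunctions or plane waves (since they 
are a basis). 

$$\begin{aligned}\langle\vec{x}|\psi_E\rangle &= \psi_E(\vec{x}) = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\psi_E\rangle\,\langle\vec{x}|\vec{p}\rangle\,d\vec{p} \\ \psi_E(\vec{x}, 0) &= \frac{1}{(2\pi\hbar)^3}\int e^{i\vec{p}\cdot\vec{x}/\hbar}\,\Psi_E(\vec{p}, 0)\,d\vec{p}\end{aligned} \tag{8.48}$$

Using the same derivation as above (8.48) we can show that 

$$\begin{aligned}\langle\vec{x}, t|\psi_E\rangle &= \psi_E(\vec{x}, t) = \frac{1}{(2\pi\hbar)^3}\int \langle\vec{p}|\psi_E\rangle\,\langle\vec{x}, t|\vec{p}\rangle\,d\vec{p} \\ \psi_E(\vec{x}, t) &= \frac{1}{(2\pi\hbar)^3}\int e^{i(\vec{p}\cdot\vec{x} - Et)/\hbar}\,\Psi_E(\vec{p}, 0)\,d\vec{p}\end{aligned} \tag{8.49}$$

We note that in the state $|\vec{p}\rangle$, which is an eigenvector of $\hat{\vec{P}}$, the momentum 
has the value $\vec{p}$ with probability = 1 (a certainty), which is the meaning of an 
eigenvector! However, the probability of finding the particle at the point $\vec{x}$ is 
independent of $\vec{x}$, i.e., 

$$|\langle\vec{x}|\vec{p}\rangle|^2 = |e^{i\vec{p}\cdot\vec{x}/\hbar}|^2 = 1 \tag{8.50}$$

The position is therefore completely uncertain in a momentum eigenstate. Read 
the other way, the same equation means that in a position eigenstate the particle 
is equally probable to have any value of momentum. The momentum is completely 
uncertain in a position eigenstate. 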

If we consider an intermediate case in which the particle is reasonably well 
localized in position space, and at the same time, has a fairly well defined 
momentum, then the wave function for such a state is called a wave packet. 

Suppose that the function $|\Psi_E(\vec{p}, 0)|$ is peaked about the value $\vec{p}_0$, with a width 
$\approx \Delta\vec{p}$. In particular, at t = 0, we choose 

$$\Psi_E(\vec{p}, 0) = g(\vec{p})\,e^{i\alpha(\vec{p})} \tag{8.51}$$

where $g(\vec{p})$, called the weight function, takes the form of a real, nonnegative 
function such as shown in Figure 8.1 below, and the phase $\alpha(\vec{p})$ is real. 

Figure 8.1: Weight Function 

We then have 

$$\psi_E(\vec{x}, t) = \frac{1}{(2\pi\hbar)^3}\int g(\vec{p})\,e^{i(\vec{p}\cdot\vec{x} - Et + \hbar\alpha(\vec{p}))/\hbar}\,d\vec{p} \tag{8.52}$$

as the wave function of a particle whose momentum is approximately $\vec{p}_0$. This 
means that most of the time, if we measure the momentum of the particle in 
this state, then we would find a value within $\Delta\vec{p}$ of $\vec{p}_0$. 

We now determine how this particle moves in time and why we have called this 
expression a wave packet. 

If for some position $\vec{x}$ (which is a parameter in the integral) the phase of the 
exponential term is not slowly varying for some values of the integration variable $\vec{p}$, 
then the integral will be essentially zero. This occurs because the exponential term is 
a very rapidly oscillating function (since $\hbar$ is very small), i.e., if you multiply a 
very rapidly varying function by a function of the assumed form of $g(\vec{p})$, then on 
the average the integrand is zero. In effect, we are seeing destructive interference 
between the plane waves. 

If, however, x is a point such that the phase of the exponential remains fairly 
constant over the same range of p where the function g(p) is nonzero, then we 
will get a nonzero value for the integral and the particle will have some nonzero 
probability of being at that point. 

One way to make this work is as follows: 

1. At time t, assume the point $\vec{x}_t$ makes the phase stationary near $\vec{p}_0$. By 
stationary we mean 

$$\nabla_{\vec{p}}\left(\vec{p}\cdot\vec{x}_t - Et + \hbar\alpha(\vec{p})\right)\Big|_{\vec{p} = \vec{p}_0} = 0 \tag{8.53}$$

or we are near an extremum of the phase and hence the phase will be 
slowly varying. 

2. We will then get strong constructive interference between the plane waves 
and $\psi_E(\vec{x}, t)$ will be nonzero. 

Solving the stationary phase equation for $\vec{x}_t$ we find, using $E = p^2/2m$, 

$$\begin{aligned}\left(\vec{x}_t - \nabla_{\vec{p}}(E)\,t + \hbar\nabla_{\vec{p}}\alpha(\vec{p})\right)\Big|_{\vec{p} = \vec{p}_0} &= 0 \\ \vec{x}_t &= \left(\nabla_{\vec{p}}(E)\,t - \hbar\nabla_{\vec{p}}\alpha(\vec{p})\right)\Big|_{\vec{p} = \vec{p}_0} \\ &= \frac{\vec{p}_0}{m}\,t - \hbar\left(\nabla_{\vec{p}}\alpha(\vec{p})\right)_{\vec{p} = \vec{p}_0} = \frac{\vec{p}_0}{m}\,t + \vec{x}_0\end{aligned} \tag{8.54}$$

Therefore, the point $\vec{x}_t$, which is essentially the center of the wave packet, 
moves in time with a constant velocity $\vec{v} = \vec{p}_0/m$. This is exactly the velocity 
one expects for a free particle of momentum $\vec{p}_0$ and mass m. It is called the 
group velocity of the wave packet. This procedure is called the stationary phase 
method. 
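
The stationary phase prediction $x_t = (p_0/m)\,t + x_0$ can be compared with a direct numerical evaluation of the superposition of plane waves in one dimension. The sketch below assumes a Gaussian weight function $g(p)$, a linear phase $\alpha(p) = -x_0 p/\hbar$ (so that $-\hbar\,d\alpha/dp = x_0$), and units with $\hbar = m = 1$; all numerical values and function names are illustrative. The overall prefactor of the integral is dropped since only the peak position matters.

```python
import cmath
import math

HBAR, M = 1.0, 1.0                  # assumed units
P0, SIGMA_P, X0 = 2.0, 0.2, 1.0     # illustrative packet parameters

def g(p):
    """Real, nonnegative weight function peaked at P0 (assumed Gaussian)."""
    return math.exp(-(p - P0) ** 2 / (2 * SIGMA_P**2))

def alpha(p):
    """Linear phase chosen so that -hbar * d(alpha)/dp = X0."""
    return -X0 * p / HBAR

def packet(x, t, n=200):
    """Unnormalized psi(x, t): trapezoidal sum over plane waves near P0."""
    lo, hi = P0 - 5 * SIGMA_P, P0 + 5 * SIGMA_P
    dp = (hi - lo) / n
    total = 0j
    for k in range(n + 1):
        p = lo + k * dp
        weight = 0.5 if k in (0, n) else 1.0
        phase = (p * x - p * p * t / (2 * M) + HBAR * alpha(p)) / HBAR
        total += weight * g(p) * cmath.exp(1j * phase)
    return total * dp

def peak_position(t):
    """Grid point (spacing 0.05) where |psi(x, t)|^2 is largest."""
    xs = [-5 + 0.05 * k for k in range(401)]
    return max(xs, key=lambda x: abs(packet(x, t)) ** 2)

x_at_0 = peak_position(0.0)   # stationary phase predicts X0 = 1
x_at_3 = peak_position(3.0)   # stationary phase predicts X0 + P0*3/M = 7
```

Within the grid resolution, the peak of $|\psi|^2$ sits at $x_0$ at $t = 0$ and has moved by $p_0 t/m$ at $t = 3$, in agreement with the group velocity found above.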


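How localized is this wave packet (particle), or what is the spatial extent of the 
packet? 

We will have a nonzero integral and hence a nonzero probability for the particle 
to be at a particular point as long as we have constructive interference or as 
long as the exponential undergoes less than one oscillation as $\vec{p}$ varies over the 
region for which the function $g(\vec{p})$ is also large. The change in the phase of the 
exponential as we vary the x-component of $\vec{p}$ is approximately 

$$\begin{aligned}\Delta\phi &= \frac{1}{\hbar}\,\Delta\left(\vec{p}\cdot\vec{x} - Et + \hbar\alpha(\vec{p})\right)_{\vec{p} = \vec{p}_0} = \frac{1}{\hbar}\,\Delta p_x\,\frac{\partial}{\partial p_x}\left(\vec{p}\cdot\vec{x} - Et + \hbar\alpha(\vec{p})\right)_{\vec{p} = \vec{p}_0} \\ &= \frac{1}{\hbar}\,\Delta p_x\left(x - \frac{\partial}{\partial p_x}\left(Et - \hbar\alpha(\vec{p})\right)\right)_{\vec{p} = \vec{p}_0} = \frac{1}{\hbar}\,\Delta p_x\,(x - x_t)\end{aligned} \tag{8.55}$$

As long as $\Delta\phi \le 2\pi$ we will get constructive interference and the integral will 
be nonzero. This tells us the extent of the wave packet in the x-direction about 
the point $x_t$: the packet reaches out at least to the distance at which $\Delta\phi = 2\pi$. We get 

$$|x - x_t| \ge \frac{2\pi\hbar}{\Delta p_x} \quad \text{or} \quad |x - x_t|\,\Delta p_x \ge h \tag{8.56}$$

or 

$$\Delta x\,\Delta p_x \ge h \tag{8.57}$$

Similarly we can show that $\Delta y\,\Delta p_y \ge h$ and $\Delta z\,\Delta p_z \ge h$. 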




These relations between the spatial extent of the wave packet or the uncertainty 
in its position, and the uncertainty in the particle momentum, are often called 
the Heisenberg uncertainty principle. We will see below that the uncertainty 
principle involves more than just these simple properties of a wave packet. We 
will also develop more details about wave packets later in this book. 
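
The inverse relation between the position and momentum widths of a packet can be made concrete with a small numerical experiment. The sketch below assumes a one-dimensional Gaussian packet $\psi(x) = e^{-x^2/4\sigma^2}e^{ip_0x/\hbar}$ in units with $\hbar = 1$ and uses root-mean-square widths; these choices and the helper names are illustrative. Note that the numerical constant on the right-hand side of a relation such as $\Delta x\,\Delta p_x \ge h$ depends on how the widths are defined, so the estimate above is an order-of-magnitude statement; with root-mean-square widths the Gaussian gives $\hbar/2$.

```python
import math

HBAR = 1.0  # assumed units

def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule on [a, b] for a real-valued f."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def widths(sigma, p0=1.0):
    """RMS widths (delta_x, delta_p) of psi(x) = exp(-x^2/(4 sigma^2)) exp(i p0 x/hbar)."""
    rho = lambda x: math.exp(-x * x / (2 * sigma**2))   # |psi|^2, unnormalized
    L = 12 * sigma
    norm = integrate(rho, -L, L)
    mean_x = integrate(lambda x: x * rho(x), -L, L) / norm
    mean_x2 = integrate(lambda x: x * x * rho(x), -L, L) / norm
    # psi' = (-x/(2 sigma^2) + i p0/hbar) psi, so
    # <p>   = integral of p0 |psi|^2 (the odd imaginary part integrates to zero)
    # <p^2> = hbar^2 * integral of |psi'|^2
    mean_p = integrate(lambda x: p0 * rho(x), -L, L) / norm
    mean_p2 = HBAR**2 * integrate(
        lambda x: (x * x / (4 * sigma**4) + (p0 / HBAR) ** 2) * rho(x), -L, L
    ) / norm
    return math.sqrt(mean_x2 - mean_x**2), math.sqrt(mean_p2 - mean_p**2)

results = {sigma: widths(sigma) for sigma in (0.5, 1.0, 2.0)}
```

Halving `sigma` halves the position width and doubles the momentum width, while the product of the two stays fixed at $\hbar/2$ for every entry in `results`.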

Note that there is no uncertainty relation between 


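$\Delta x$ and $\Delta y$, $\Delta z$, $\Delta p_y$, $\Delta p_z$ 
$\Delta y$ and $\Delta x$, $\Delta z$, $\Delta p_x$, $\Delta p_z$ 
$\Delta z$ and $\Delta x$, $\Delta y$, $\Delta p_x$, $\Delta p_y$ 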

What is the difference between these pairs of variables? The difference is in 
their commutators. It turns out, in this example of the wave packet, that 
only those observables that are represented by non-commuting operators have 
nonzero uncertainty relations. 

This result has nothing to do with wave packets, Schrodinger equations, wave 
functions or Fourier transforms, although all of these quantities can be used to 
show that uncertainty relations exist. 

The existence of uncertainty relations between non-commuting observables was 
actually built into the theory of Quantum Mechanics when we assumed that we 
were in an abstract linear vector space. All linear vector spaces have a property 
called the Schwarz inequality and it always leads to such uncertainty relations 
for non-commuting operators. So the Heisenberg uncertainty principle, to which 
much mysticism has been attached, is really an assumption and little more than 
a lemma that follows from the original assumption of a vector space. 


8.3. Derivation of the Uncertainty Relations in General 


Given two Hermitian operators $\hat{A}$ and $\hat{B}$ we define the two new operators 

$$\hat{D}_A = \hat{A} - \langle\hat{A}\rangle \quad \text{and} \quad \hat{D}_B = \hat{B} - \langle\hat{B}\rangle \tag{8.58}$$

where $\langle\hat{O}\rangle = \langle\psi|\,\hat{O}\,|\psi\rangle$ equals the average or expectation value in the state $|\psi\rangle$. In 
the statistical analysis of data, we use a quantity called the standard or 
mean-square deviation as a measure of the uncertainty of an observed quantity. It is 
defined, for a set of N measurements of the quantity q, by 

$$\begin{aligned}(\Delta q)^2 &= (\text{standard deviation})^2 = \frac{1}{N}\sum_{i=1}^{N}(q_i - q_{average})^2 \\ &= \frac{1}{N}\sum_{i=1}^{N}(q_i)^2 - \frac{1}{N}\sum_{i=1}^{N}(q_i\,q_{average}) - \frac{1}{N}\sum_{i=1}^{N}(q_{average}\,q_i) + \frac{1}{N}\sum_{i=1}^{N}(q_{average})^2 \\ &= (q^2)_{average} - (q_{average})^2\end{aligned} \tag{8.59}$$

where we have used 

$$q_{average} = \frac{1}{N}\sum_{i=1}^{N} q_i \tag{8.60}$$
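
The identity between the two forms of the mean-square deviation is easy to confirm on a small data set. The numbers below are hypothetical measurement results, invented only to exercise the formula.

```python
# Hypothetical measurement results q_i (illustrative values only).
data = [2.0, 3.5, 1.0, 4.5, 3.0]
N = len(data)

# q_average = (1/N) * sum of q_i
q_avg = sum(data) / N

# Direct definition: (1/N) * sum of (q_i - q_average)^2
var_direct = sum((q - q_avg) ** 2 for q in data) / N

# Expanded form: (q^2)_average - (q_average)^2
var_identity = sum(q * q for q in data) / N - q_avg**2
```

Both expressions give the same value of the mean-square deviation for this data set, as the expansion of the square requires.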

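In analogy, we define the mean-square deviations for $\hat{A}$ and $\hat{B}$ as 

$$(\Delta\hat{A})^2 = \langle\hat{A}^2\rangle - \langle\hat{A}\rangle^2 = \langle(\hat{A} - \langle\hat{A}\rangle)^2\rangle = \langle\hat{D}_A^2\rangle \tag{8.61}$$

$$(\Delta\hat{B})^2 = \langle\hat{B}^2\rangle - \langle\hat{B}\rangle^2 = \langle(\hat{B} - \langle\hat{B}\rangle)^2\rangle = \langle\hat{D}_B^2\rangle \tag{8.62}$$

We then have 

$$(\Delta\hat{A})^2(\Delta\hat{B})^2 = \langle\hat{D}_A^2\rangle\langle\hat{D}_B^2\rangle \tag{8.63}$$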

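Now we assume that 

$$[\hat{D}_A, \hat{B}] = [\hat{A}, \hat{B}] = [\hat{D}_A, \hat{D}_B] = i\hat{C} \tag{8.64}$$

where $\hat{C}$ is also a Hermitian operator and we let 

$$|\alpha\rangle = \hat{D}_A\,|\psi\rangle = (\hat{A} - \langle\hat{A}\rangle)\,|\psi\rangle \quad \text{and} \quad |\beta\rangle = \hat{D}_B\,|\psi\rangle = (\hat{B} - \langle\hat{B}\rangle)\,|\psi\rangle \tag{8.65}$$

Then we have 

$$(\Delta\hat{A})^2 = \langle\hat{D}_A^2\rangle = \langle\psi|\,\hat{D}_A^2\,|\psi\rangle = \left(\langle\psi|\,\hat{D}_A\right)\left(\hat{D}_A\,|\psi\rangle\right) = \langle\alpha|\alpha\rangle \tag{8.66}$$

$$(\Delta\hat{B})^2 = \langle\hat{D}_B^2\rangle = \langle\psi|\,\hat{D}_B^2\,|\psi\rangle = \left(\langle\psi|\,\hat{D}_B\right)\left(\hat{D}_B\,|\psi\rangle\right) = \langle\beta|\beta\rangle \tag{8.67}$$

The Schwarz inequality says that for any two vectors we must have the relation 

$$\langle\alpha|\alpha\rangle\,\langle\beta|\beta\rangle \ge |\langle\alpha|\beta\rangle|^2 \tag{8.68}$$

We therefore have 

$$\begin{aligned}(\Delta\hat{A})^2(\Delta\hat{B})^2 &= \langle\hat{D}_A^2\rangle\langle\hat{D}_B^2\rangle = \langle\alpha|\alpha\rangle\,\langle\beta|\beta\rangle \ge |\langle\alpha|\beta\rangle|^2 \\ (\Delta\hat{A})^2(\Delta\hat{B})^2 &\ge |\langle\psi|\,\hat{D}_A\hat{D}_B\,|\psi\rangle|^2 = |\langle\hat{D}_A\hat{D}_B\rangle|^2\end{aligned} \tag{8.69}$$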

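Now, writing $\Delta\hat{A} = \hat{D}_A$ and $\Delta\hat{B} = \hat{D}_B$ for the deviation operators, 

$$\begin{aligned}|\langle\hat{D}_A\hat{D}_B\rangle|^2 &= |\langle\Delta\hat{A}\,\Delta\hat{B}\rangle|^2 = \left|\left\langle\frac{1}{2}[\Delta\hat{A}, \Delta\hat{B}] + \frac{1}{2}\{\Delta\hat{A}, \Delta\hat{B}\}\right\rangle\right|^2 \\ &= \left|\left\langle\frac{1}{2}[\hat{A}, \hat{B}] + \frac{1}{2}\{\Delta\hat{A}, \Delta\hat{B}\}\right\rangle\right|^2\end{aligned} \tag{8.70}$$

where 

$[\Delta\hat{A}, \Delta\hat{B}]^\dagger = -[\Delta\hat{A}, \Delta\hat{B}]$ → anti-Hermitian → expectation value is imaginary 
$\{\Delta\hat{A}, \Delta\hat{B}\}^\dagger = \{\Delta\hat{A}, \Delta\hat{B}\}$ → Hermitian → expectation value is real 

Therefore, 

$$\begin{aligned}(\Delta\hat{A})^2(\Delta\hat{B})^2 &\ge \left|\left\langle\frac{1}{2}[\hat{A}, \hat{B}] + \frac{1}{2}\{\Delta\hat{A}, \Delta\hat{B}\}\right\rangle\right|^2 \\ &\ge \frac{1}{4}\left|\langle i\hat{C}\rangle + a\right|^2\end{aligned}$$

where $a = \langle\{\Delta\hat{A}, \Delta\hat{B}\}\rangle$ is a real number. Then 

$$(\Delta\hat{A})^2(\Delta\hat{B})^2 \ge \frac{1}{4}|a|^2 + \frac{1}{4}|\langle\hat{C}\rangle|^2 \ge \frac{1}{4}|\langle\hat{C}\rangle|^2 \tag{8.71}$$

since $|a|^2/4 \ge 0$. This is the Heisenberg uncertainty principle. It is simply the 
Schwarz inequality!! 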
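
The general inequality can be tested on a concrete pair of non-commuting Hermitian operators. The sketch below assumes, for illustration only, a spin-1/2 system with $\hat{A} = \sigma_x$ and $\hat{B} = \sigma_y$, so that $[\hat{A}, \hat{B}] = 2i\sigma_z$ and $\hat{C} = 2\sigma_z$, and a state parametrized by two angles; none of these choices comes from the derivation itself.

```python
import cmath
import math

# Pauli matrices as nested lists (illustrative choice of A, B and C/2).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def apply(m, v):
    """Matrix times 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def expect(m, v):
    """Expectation value <v|m|v> of a Hermitian matrix (real part taken)."""
    w = apply(m, v)
    return (v[0].conjugate() * w[0] + v[1].conjugate() * w[1]).real

def variance(m, v):
    """<m^2> - <m>^2, using <m^2> = ||m v||^2 for Hermitian m."""
    w = apply(m, v)
    mean_sq = abs(w[0]) ** 2 + abs(w[1]) ** 2
    return mean_sq - expect(m, v) ** 2

def check(theta, phi):
    """Return (left side, right side) of the uncertainty relation for A=SX, B=SY."""
    psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]
    lhs = variance(SX, psi) * variance(SY, psi)
    rhs = 0.25 * (2 * expect(SZ, psi)) ** 2   # (1/4)|<C>|^2 with C = 2*sigma_z
    return lhs, rhs

lhs_generic, rhs_generic = check(1.0, 0.7)   # generic state: strict inequality
lhs_equal, rhs_equal = check(1.0, 0.0)       # phi = 0: the bound is saturated
```

For the generic angles the left side exceeds the right side, and for $\varphi = 0$ the two sides coincide, which corresponds to the case $a = 0$ in the derivation.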

The standard form in most texts follows from $[\hat{x}, \hat{p}_x] = i\hbar$, which gives 

$$\Delta x\,\Delta p_x \ge \frac{\hbar}{2} \tag{8.72}$$

We note that if $[\hat{A}, \hat{B}] = i\hat{C} = 0$, we then get 

$$(\Delta\hat{A})^2(\Delta\hat{B})^2 \ge 0 \tag{8.73}$$

or commuting observables do not have an uncertainty principle! 

8.3.1. The Meaning of the Indeterminacy Relations 

What is the significance of indeterminacy relations in the world of experimental 
physics? 

Consider the experimental results shown in Figure 8.2 below: 

Figure 8.2: Experimental Data 


These are frequency distributions for the results of independent measurements 
of Q and P on an ensemble of similarly prepared systems, i.e., on each of a 
large number of similarly prepared systems one performs a single measurement 
(either Q or P). The histograms are the statistical distribution of the results. 

The standard deviations (square roots of the variances) shown in Figure 8.2 must 
satisfy (according to the theory) the relation 

$$\Delta_Q\,\Delta_P \ge \frac{\hbar}{2} \tag{8.74}$$

They must be distinguished from the resolution of the individual measurements, 
$\delta Q$ and $\delta P$. 

Let me emphasize these points: 

1. The quantities $\Delta_Q$ and $\Delta_P$ are not errors of measurement. The errors or 
preferably the resolutions of the Q and P measuring instruments are $\delta Q$ 
and $\delta P$. They are logically unrelated to $\Delta_Q$ and $\Delta_P$ and to the uncertainty 
relations except for the practical requirement that if 

$$\delta Q > \Delta_Q \quad (\text{or } \delta P > \Delta_P) \tag{8.75}$$

then it will not be possible to determine $\Delta_Q$ (or $\Delta_P$) in the experiment 
and the experiment cannot test the uncertainty relation. 

2. The experimental test of the indeterminacy relation does not involve 
simultaneous measurements of Q and P, but rather it involves the measurement 
of one or the other of these dynamical variables on each independently 
prepared representative of the particular state being studied. 

Why am I being so picky here? 

The quantities $\Delta_Q$ and $\Delta_P$ as defined here are often misinterpreted as the
errors of individual measurements. This probably arises because Heisenberg's
original paper on this subject, published in 1927, was based on an early version
of quantum mechanics that predates the systematic formulation and statistical
interpretation of quantum mechanics as it exists now. The derivation, as carried
out here, was not possible in 1927!


8.3.2. Time-Energy Uncertainty Relations 

The use of time-energy uncertainty relations in most textbooks is simply incor¬ 
rect. Let us now derive the most we can say about such relations. 


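Earlier (6.380), we showed that

$$\frac{d\langle \hat{Q} \rangle_t}{dt} = \mathrm{Tr}\left[ \frac{i}{\hbar} W_0 [\hat{H}, \hat{Q}_H(t)] + W_0 \left( \frac{\partial \hat{Q}}{\partial t} \right)_H \right] \qquad (8.76)$$

in the Heisenberg picture. So we can write in general that for any operator $\hat{Q}$

$$\frac{d\langle \hat{Q} \rangle}{dt} = \frac{1}{i\hbar} \langle [\hat{Q}, \hat{H}] \rangle + \left\langle \frac{\partial \hat{Q}}{\partial t} \right\rangle \qquad (8.77)$$

Now consider a system whose Hamiltonian H does not explicitly depend on time
and let Q be another observable of this system which does not depend on time
explicitly so that

$$\frac{d\langle \hat{Q} \rangle}{dt} = \frac{1}{i\hbar} \langle [\hat{Q}, \hat{H}] \rangle \qquad (8.78)$$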

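We consider the dynamical state of the system at a given time t. Let $|\psi\rangle$
be the vector representing that state. Call $\Delta Q$ and $\Delta E$ the root-mean-square
deviations of Q and H, respectively. Applying the Schwarz inequality (as in
section 8.3) to the vectors $(\hat{Q} - \langle \hat{Q} \rangle)|\psi\rangle$ and $(\hat{H} - \langle \hat{H} \rangle)|\psi\rangle$ and carrying out the
same manipulations, we find after some calculations

$$\Delta Q \, \Delta E \ge \frac{1}{2} \left| \langle [\hat{Q}, \hat{H}] \rangle \right| \qquad (8.79)$$

the equality being realized when $|\psi\rangle$ satisfies the equation

$$(\hat{Q} - \alpha)|\psi\rangle = i\gamma(\hat{H} - \epsilon)|\psi\rangle \qquad (8.80)$$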


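where $\alpha$, $\gamma$, and $\epsilon$ are arbitrary real constants. We then have from (8.78)

$$\frac{\Delta Q}{\left| \dfrac{d\langle \hat{Q} \rangle}{dt} \right|} \, \Delta E \ge \frac{\hbar}{2} \qquad (8.81)$$

or

$$\tau_Q \, \Delta E \ge \frac{\hbar}{2} \qquad (8.82)$$

where we have defined

$$\tau_Q = \frac{\Delta Q}{\left| \dfrac{d\langle \hat{Q} \rangle}{dt} \right|} \qquad (8.83)$$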
$\tau_Q$ appears as a time characteristic of the evolution of the expectation value
of Q. It is the time required for the center $\langle \hat{Q} \rangle$ of the statistical
distribution of Q to be displaced by an amount equal to its width $\Delta Q$. In
other words, it is the time necessary for this statistical distribution to be appreciably
modified. In this way we can define a characteristic evolution time for each
dynamical variable of the system.

Let $\tau$ be the shortest of the times thus defined. $\tau$ may be considered as the
characteristic time of evolution of the system itself, that is, whatever the measurement
carried out on the system at an instant of time $t'$, the statistical
distribution of the results is essentially the same as would be obtained at the
instant $t$, as long as the difference $|t - t'|$ is less than $\tau$.

This time $\tau$ and the energy spread $\Delta E$ satisfy the time-energy uncertainty relation

$$\tau \, \Delta E \ge \frac{\hbar}{2} \qquad (8.84)$$

If, in particular, the system is in a stationary state where $d\langle \hat{Q} \rangle/dt = 0$ no matter
what Q, and consequently $\tau$ is infinite, then $\Delta E = 0$ according to (8.84).
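To make the characteristic time $\tau_Q$ concrete, here is a minimal sketch for a two-level system. All specifics are assumptions chosen for illustration: natural units with $\hbar = 1$, a Hamiltonian $\hat{H} = (\hbar\omega/2)\sigma_z$, an equal superposition of its two eigenstates, and the observable $\hat{Q} = \sigma_x$. For this choice $\langle \hat{Q} \rangle = \cos\omega t$, $\Delta Q = |\sin\omega t|$ and $\Delta E = \hbar\omega/2$, so the product $\tau_Q \, \Delta E$ should come out equal to $\hbar/2$, the smallest value the relation allows.

```python
# Sketch: evaluate tau_Q * Delta E numerically for a two-level system.
# Assumptions: hbar = 1, H = (hbar*omega/2) sigma_z, Q = sigma_x.
import cmath
import math

HBAR = 1.0


def state(t, omega):
    """Equal superposition of the two energy eigenstates at time t."""
    a = cmath.exp(-1j * omega * t / 2) / math.sqrt(2)
    b = cmath.exp(+1j * omega * t / 2) / math.sqrt(2)
    return a, b


def mean_sigma_x(t, omega):
    """<sigma_x> = 2 Re(a* b), which equals cos(omega t) here."""
    a, b = state(t, omega)
    return 2 * (a.conjugate() * b).real


def tau_times_delta_e(t, omega, dt=1e-6):
    """tau_Q * Delta E, with d<Q>/dt from a central finite difference."""
    q = mean_sigma_x(t, omega)
    delta_q = math.sqrt(max(0.0, 1.0 - q * q))  # since sigma_x^2 = 1
    dq_dt = (mean_sigma_x(t + dt, omega) - mean_sigma_x(t - dt, omega)) / (2 * dt)
    tau_q = delta_q / abs(dq_dt)
    delta_e = HBAR * omega / 2  # <H> = 0 and <H^2> = (hbar*omega/2)^2
    return tau_q * delta_e
```

The function should be evaluated away from times where $\sin\omega t = 0$, since both $\Delta Q$ and $d\langle \hat{Q} \rangle/dt$ vanish there and the ratio is numerically undefined.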

Ordinary time t is just a parameter in non-relativistic QM and not an operator!
We cannot say that

$$\Delta t \, \Delta E \ge \frac{\hbar}{2} \qquad (8.85)$$

which is an equation that has no meaning!


8.4. The Wave Function and Its Meaning 

The Schrodinger equation

$$-\frac{\hbar^2}{2m} \nabla^2 \psi_E(\vec{x}, t) + V(\vec{x}) \psi_E(\vec{x}, t) = i\hbar \frac{\partial}{\partial t} \psi_E(\vec{x}, t) \qquad (8.86)$$

has a mathematical form similar to that of a type of equation called a wave
equation.

Since other wave equations imply the existence of real physical fields or waves
propagating in real three dimensional space, a possible interpretation of the
Schrodinger wave function $\psi_E(\vec{x}, t)$ is that of a wave in real, three dimensional
space. We might also associate the wave field with a particle or a particle could
be identified with a wave packet solution to the Schrodinger equation, as we did
earlier in our discussion.

These are all misinterpretations!!! 

We are being misled here because we are working with a simple system, namely,
a single particle in three-dimensional space.

To see that these interpretations are not valid, we must look at the more complicated
case of a system of N particles. We now generalize to a coordinate
representation for the system of N particles as follows:

1. Choose as a basis the set of vectors that is a common set of eigenvectors
for the N position operators $\hat{Q}^{(1)}, \hat{Q}^{(2)}, \ldots, \hat{Q}^{(N)}$ corresponding to the N
particles.

2. Assume that each position operator satisfies an eigenvalue/eigenvector
equation of the form

$$\hat{Q}^{(i)} |\vec{x}^{(i)}\rangle = \vec{x}^{(i)} |\vec{x}^{(i)}\rangle \, , \quad i = 1, 2, \ldots, N \qquad (8.87)$$

and that each of the sets of single particle eigenvectors forms a basis for
the 3-dimensional subspace of the single particle. The basis states for the
N particle system are then the direct product states among the N sets of
single particle eigenvectors. We write them as

$$|\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)}\rangle = |\vec{x}^{(1)}\rangle \otimes |\vec{x}^{(2)}\rangle \otimes \cdots \otimes |\vec{x}^{(N)}\rangle \qquad (8.88)$$

The state vector $|\Psi\rangle$ representing an N particle system is then represented in
the 3N-dimensional configuration space corresponding to the 3N coordinates of
the N particles by

$$\Psi(\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)}) = \langle \vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)} | \Psi \rangle \qquad (8.89)$$

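The Hamiltonian for the N particle system is given by

$$\hat{H} = \sum_{n=1}^{N} \hat{H}_n + \hat{U}(\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)}) \qquad (8.90)$$

where

$$\hat{H}_n = \text{single particle Hamiltonian for the } n\text{th particle} = -\frac{\hbar^2}{2m_n} \nabla_n^2 + V(\vec{x}^{(n)}) \qquad (8.91)$$

and $\hat{U}(\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)})$ = the interparticle interaction potential energy.

The equation of motion (the N-particle Schrodinger equation) is then

$$\hat{H} \Psi(\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)}) = i\hbar \frac{\partial}{\partial t} \Psi(\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)}) \qquad (8.92)$$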

As we shall now see, this N particle equation does not allow any of the interpre¬ 
tations that might seem to be valid for the single particle Schrodinger equation. 
If we associated a physical wave in real, three dimensional space with a particle
or if a particle were to be identified with a wave packet, then there would have
to be N interacting waves in the real, three dimensional space where we actually
measure things. It is clear, however, that the N particle Schrodinger equation
says that this is NOT the case. There is only one wave function in an abstract
3N-dimensional configuration space.

We are able to make the incorrect interpretation in the one particle case because
it just happens, in that case, that the real, 3-dimensional space has a one-to-one
correspondence to the 3-dimensional configuration space of the Schrodinger equation.

The proper interpretation of $\Psi$ is that it is a statistical state function. It is nothing
more than a function that enables us to calculate probability distributions
for all observables and their expectation values. The important physical quantities
are the probabilities and NOT the function used to calculate them.

$|\Psi(\vec{x}^{(1)}, \vec{x}^{(2)}, \ldots, \vec{x}^{(N)})|^2$ = probability density in configuration space for
particle 1 to be at $\vec{x}^{(1)}$
particle 2 to be at $\vec{x}^{(2)}$
...
particle N to be at $\vec{x}^{(N)}$

We can demonstrate the necessity for a purely statistical interpretation for the
N-particle wave function as follows. Consider the experiment with photons
shown in Figure 8.3 below.

Figure 8.3: Experimental Setup

We have a source which emits light (photons). This light falls on a half-silvered
mirror, which allows 1/2 of the light to be transmitted through and 1/2 to be
reflected as shown. The transmitted and reflected light is detected with the
two photomultipliers $D_1$ and $D_2$, respectively. Finally, signals generated by the
detectors when they record a photon hit are sent to a coincidence counter,
which records a count when it receives two signals at the same time (one from
each detector) to within some specified accuracy.

Now, suppose that the source can be described as an emitter of identical localized
wave packets ($|\Psi|^2$ is nonzero only over a finite region of space). Assume that
these packets are emitted at random times, at an average rate R per second and
that no more than one packet exists in a small interval $\Delta t$. The probability of
an emission in any short interval of length $\Delta t$ is then

$$p = R \, \Delta t \qquad (8.93)$$

In each of these intervals, the detectors either see a photon or they do not. 
The experimental arrangement guarantees that over long periods of time each 
detector sees 1/2 of the emitted photons. 

We can learn more about the details of what is happening as the detectors 
record photon hits by monitoring the temporal response of the two detectors, 
in particular, we can ask whether the detectors have both recorded photon hits 
during the same interval. 

The experimental procedure is to send light into the system and record the
number of coincidences relative to the number of individual counts of the detectors.

The results are analyzed in terms of an anti-coincidence parameter A given by

$$A = \frac{P_c}{P_1 P_2} \qquad (8.94)$$

where

$P_1$ = experimentally measured probability of detector 1 responding
$P_2$ = experimentally measured probability of detector 2 responding
$P_c$ = experimentally measured probability of coincidences

If light is composed of single particles (photons) and photons are not divisible
and $|\Psi(\vec{x}, t)|^2$ = the probability per unit volume at time t that the photon will
be located within some small volume about the point $\vec{x}$, then the two detectors
should never both record a hit in the same interval (they are mutually exclusive
events) and thus $P_c = A = 0$.


As we shall see later on when we analyze the space-time behavior of wave pack¬ 
ets, if the photon is a wave packet, then the packet will split into two equal 
parts, one on each path. This means that both detectors will always record hits
during the same interval and thus P c = 1. 

The probability of a detector recording a count is proportional to the amount
of the wave amplitude in the region of the detector, in particular, to

$$\int_{\text{detector volume}} |\Psi|^2 \, d\vec{x} \qquad (8.95)$$

For a symmetrical, equal path device, this is exactly p/2 for each detector. If the
detectors are far apart, the respective triggerings are independent (we assume
that a spacelike interval exists between the events). Therefore, the probability
of coincidence is given by

$$P_c = \frac{p^2}{4} \qquad (8.96)$$

Experiments of this type were carried out by Clauser (1974) and Grangier, Roger
and Aspect (1986).

They obtained A = 0. This confirmed that light is photons or single particles 
with a statistical interpretation of the wave function and not particles repre¬ 
sented by wave packets or wave fields of some kind. 
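The two competing predictions for the anti-coincidence parameter A of (8.94) can be illustrated with a small Monte Carlo sketch. Everything here is a toy model under stated assumptions: the function name, the fixed random seed, and the way each picture is sampled are ours. In the "particle" picture an emitted photon goes to exactly one detector; in the "wave" picture each detector fires independently with probability p/2 in every interval, as in the argument leading to (8.96).

```python
# Sketch: toy Monte Carlo estimate of A = P_c / (P_1 * P_2).
# Assumptions: idealized detectors, independent time intervals, and the
# two sampling rules described in the lead-in paragraph.
import random


def anticoincidence(n_intervals, p, model, seed=0):
    """Estimate A for model 'particle' or 'wave' with emission probability p."""
    rng = random.Random(seed)
    n1 = n2 = nc = 0
    for _ in range(n_intervals):
        if model == "particle":
            emitted = rng.random() < p
            to_d1 = rng.random() < 0.5
            hit1 = emitted and to_d1          # photon reaches detector 1 only
            hit2 = emitted and not to_d1      # or detector 2 only
        else:
            hit1 = rng.random() < p / 2       # each half-packet triggers
            hit2 = rng.random() < p / 2       # its detector independently
        n1 += hit1
        n2 += hit2
        nc += hit1 and hit2
    p1 = n1 / n_intervals
    p2 = n2 / n_intervals
    pc = nc / n_intervals
    return pc / (p1 * p2)
```

In this sketch the particle model can never produce a coincidence, so it returns A = 0, while the wave model returns a value that fluctuates statistically around A = 1.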


8.5. One-Dimensional Systems 

The time-independent Schrodinger equation in 1-dimension is

$$-\frac{\hbar^2}{2m} \frac{d^2 \psi_E(x)}{dx^2} + V(x) \psi_E(x) = E \psi_E(x) \qquad (8.97)$$

The solutions $\psi_E(x)$ are the energy eigenstates (eigenfunctions). As we have
seen, their time dependence is given by

$$\psi_E(x, t) = e^{-iEt/\hbar} \psi_E(x, 0) \quad \text{where} \quad \psi_E(x, 0) = \langle x | E \rangle \qquad (8.98)$$

and

$$\hat{H} |E\rangle = E |E\rangle \quad \text{where} \quad \hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{x}) \qquad (8.99)$$

We are thus faced with solving an ordinary differential equation with boundary
conditions. Since $\psi_E(x)$ is physically related to a probability amplitude and
hence to a measurable probability, we assume that $\psi_E(x)$ is continuous.


Using this assumption, we can determine the general continuity properties of
$d\psi_E(x)/dx$. The continuity property at a particular point, say $x = x_0$, is derived
as follows:

$$\int_{x_0-\epsilon}^{x_0+\epsilon} \frac{d^2 \psi_E(x)}{dx^2} \, dx = \int_{x_0-\epsilon}^{x_0+\epsilon} d\left( \frac{d\psi_E(x)}{dx} \right) = -\frac{2m}{\hbar^2} \left[ E \int_{x_0-\epsilon}^{x_0+\epsilon} \psi_E(x) \, dx - \int_{x_0-\epsilon}^{x_0+\epsilon} V(x) \psi_E(x) \, dx \right] \qquad (8.100)$$

Taking the limit as $\epsilon \to 0$ we have

$$\lim_{\epsilon \to 0} \left[ \left. \frac{d\psi_E(x)}{dx} \right|_{x=x_0+\epsilon} - \left. \frac{d\psi_E(x)}{dx} \right|_{x=x_0-\epsilon} \right] = -\frac{2m}{\hbar^2} \left[ E \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} \psi_E(x) \, dx - \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} V(x) \psi_E(x) \, dx \right] \qquad (8.101)$$

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} V(x) \psi_E(x) \, dx \qquad (8.102)$$

where we have used the continuity of $\psi_E(x)$ to set

$$\lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} \psi_E(x) \, dx = 0 \qquad (8.103)$$

This makes it clear that whether or not $d\psi_E(x)/dx$ has a discontinuity depends
directly on the properties of the potential energy function.


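If V(x) is continuous at $x = x_0$ (harmonic oscillator example later), i.e., if

$$\lim_{\epsilon \to 0} \left[ V(x_0 + \epsilon) - V(x_0 - \epsilon) \right] = 0 \qquad (8.104)$$

then

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} V(x) \psi_E(x) \, dx = 0 \qquad (8.105)$$

and $d\psi_E(x)/dx$ is continuous at $x = x_0$.

If V(x) has a finite discontinuity (jump) at $x = x_0$ (finite square well and square
barrier examples later), i.e., if

$$\lim_{\epsilon \to 0} \left[ V(x_0 + \epsilon) - V(x_0 - \epsilon) \right] = \text{finite} \qquad (8.106)$$

then

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} V(x) \psi_E(x) \, dx = 0 \qquad (8.107)$$

and $d\psi_E(x)/dx$ is continuous at $x = x_0$.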


Finally, if V(x) has an infinite jump at $x = x_0$ (infinite square well and delta-function
examples later), then we have two choices:

1. if the potential is infinite over an extended range of x (the infinite well),
then we must force $\psi_E(x) = 0$ in that region and use only the continuity
of $\psi_E(x)$ as a boundary condition at the edge of the region

2. if the potential is infinite at a single point, i.e., $V(x) = A\delta(x - x_0)$, then

$$\Delta\left( \frac{d\psi_E(x)}{dx} \right) = \frac{2m}{\hbar^2} \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} V(x) \psi_E(x) \, dx = \frac{2m}{\hbar^2} \lim_{\epsilon \to 0} \int_{x_0-\epsilon}^{x_0+\epsilon} A\delta(x - x_0) \psi_E(x) \, dx = \frac{2mA}{\hbar^2} \lim_{\epsilon \to 0} \psi_E(x_0) = \frac{2mA}{\hbar^2} \psi_E(x_0) \qquad (8.108)$$

and, thus, $d\psi_E(x)/dx$ is discontinuous at $x = x_0$.
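As a brief worked illustration of the jump condition (8.108), consider a bound state of an attractive delta-function potential. The specific setup is our assumption for the sake of the example: $A < 0$, energy $E < 0$, and an exponentially decaying wave function on both sides of $x_0$ with normalization constant $N$ and decay constant $\kappa$.

```latex
% Assumption: V(x) = A\,\delta(x - x_0) with A < 0 and a bound state E < 0.
\psi_E(x) = N e^{-\kappa |x - x_0|} , \qquad \kappa = \frac{\sqrt{-2mE}}{\hbar}

% Derivative just to the right minus derivative just to the left of x_0:
\Delta\left( \frac{d\psi_E}{dx} \right) = (-\kappa N) - (+\kappa N) = -2\kappa \, \psi_E(x_0)

% Equating this to the right-hand side of (8.108):
-2\kappa \, \psi_E(x_0) = \frac{2mA}{\hbar^2} \, \psi_E(x_0)
\quad \Rightarrow \quad
\kappa = -\frac{mA}{\hbar^2} = \frac{m|A|}{\hbar^2} ,
\qquad
E = -\frac{\hbar^2 \kappa^2}{2m} = -\frac{mA^2}{2\hbar^2}
```

The derivative discontinuity alone fixes the decay constant, and hence the single bound-state energy, which is why this boundary condition is so useful in the delta-function examples.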

The last thing we must worry about is the validity of our probability interpretation
of $\psi_E(x)$, i.e.,

$\psi_E(x) = \langle x | \psi_E \rangle$ = probability amplitude for the particle
in the state $|\psi_E\rangle$ to be found at x

which says that we must also have

$$\int_{-\infty}^{\infty} |\psi_E(x)|^2 \, dx < \infty \qquad (8.109)$$

This means that we must be able to normalize the wave functions and make the
total probability that the particle is somewhere on the x-axis equal to one.

A wide range of interesting physical systems can be studied using 1-dimensional 
potential energy functions. We will consider potentials in the form of square 
wells and square barriers, delta-functions, linear functions and parabolic func¬ 
tions. 

We will now learn how to work with potentials in 1-dimension by doing a variety 
of different examples. Along the way we will learn some very clever techniques 
and tricks and expand our understanding of the physics and mathematics of 
Quantum Mechanics. 

We start with a simple system to illustrate the process for solving the Schrodinger 
equation. 

8.5.1. One-Dimensional Barrier 

Consider the potential energy function

$$V(x) = \begin{cases} 0 & x < 0 \\ V_0 & x > 0 \end{cases} \qquad (8.110)$$

which looks like Figure 8.4 below.

Figure 8.4: Finite Step barrier

Since there are no infinite jumps in the potential energy function, both $\psi(x)$ and
$d\psi(x)/dx$ are continuous everywhere. We have two distinct regions to consider,
which are labeled I and II in Figure 8.4.


In region I, $V(x) = 0$ and the Schrodinger equation is

$$-\frac{\hbar^2}{2m} \frac{d^2 \psi_I(x)}{dx^2} = E \psi_I(x) \qquad (8.111)$$

Let us define k by

$$E = \frac{p^2}{2m} = \frac{\hbar^2 k^2}{2m} \qquad (8.112)$$

There are two possible solutions to this equation in region I, namely,

$$\psi_I(x) = e^{\pm ikx} \qquad (8.113)$$

each with energy

$$E = \frac{p^2}{2m} = \frac{\hbar^2 k^2}{2m} \qquad (8.114)$$

The most general solution is a linear combination of the two possible solutions

$$\psi_I(x) = A e^{+ikx} + B e^{-ikx} \qquad (8.115)$$

This is a linear combination of two waves. Since this is an energy eigenstate it
has a time dependence

$$e^{-iEt/\hbar} \qquad (8.116)$$

If we insert the time dependence we have

$$\psi_I(x, t) = A e^{i(kx - \omega t)} + B e^{-i(kx + \omega t)} \, , \quad \omega = \frac{E}{\hbar} \qquad (8.117)$$

The first term is a traveling wave moving in the +x direction with phase velocity
$\omega/k$ and the second term is a traveling wave moving in the −x direction with a
phase velocity $\omega/k$.

We can think of this solution in region I as representing an incident wave moving
in the +x direction with amplitude A and a reflected wave moving in the −x
direction with an amplitude B. The reflection takes place at the potential
discontinuity much the same as a reflection occurs for light on the interface
between air and glass. We will show this clearly using wave packets later in this
section. The wave packets will also allow us to relate these results to particle
motion.


In region II, $V(x) = V_0$ and Schrodinger's equation is

$$-\frac{\hbar^2}{2m} \frac{d^2 \psi_{II}(x)}{dx^2} = (E - V_0) \psi_{II}(x) \qquad (8.118)$$

If we define

$$\gamma^2 = \frac{2m}{\hbar^2} (E - V_0) \qquad (8.119)$$

we then again have two possible solutions

$$\psi_{II}(x) = e^{\pm i\gamma x} \qquad (8.120)$$

Two cases arise

$E > V_0$ → $\gamma$ is real → traveling wave solutions
$E < V_0$ → $\gamma$ is imaginary → real exponential solutions

We first consider $E > V_0$, $\gamma$ real. We then have the general solution in region II

$$\psi_{II}(x) = C e^{+i\gamma x} + D e^{-i\gamma x} \qquad (8.121)$$

Now the actual physical experiment we are considering must always restrict the 
possibilities (always remember that we are physicists and not just solving an 
abstract mathematical problem). 

If we assume the incident wave is traveling in the +x direction in region I (coming 
in from x = -oo), then the existence of a wave in region I traveling in the —x 
direction makes sense. It is a reflected wave (reflection occurs at x = 0). In region 
II, however, we cannot physically have a wave traveling in the —x direction (what 
is its source?). We can, however, have a wave traveling in the +x direction. It 
is the transmitted wave (again think of an analogy to an air-glass interface). 


So, on physical grounds, we will assume that D = 0. We then have the solutions
for $E > V_0$, $\gamma$ real

$$\psi_I(x) = A e^{+ikx} + B e^{-ikx} \quad x < 0 \qquad (8.122)$$

$$\psi_{II}(x) = C e^{+i\gamma x} \quad x > 0 \, , \quad \gamma^2 = \frac{2m}{\hbar^2} (E - V_0) \qquad (8.123)$$

We now use the continuity conditions at x = 0.

$$\psi_I(0) = \psi_{II}(0) \quad \text{and} \quad \frac{d\psi_I(0)}{dx} = \frac{d\psi_{II}(0)}{dx} \qquad (8.124)$$

which give the equations

$$A + B = C \quad \text{and} \quad ik(A - B) = i\gamma C \qquad (8.125)$$

which imply the solutions

$$\frac{C}{A} = \frac{2k}{k + \gamma} \quad \text{and} \quad \frac{B}{A} = \frac{k - \gamma}{k + \gamma} \qquad (8.126)$$

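We can see the physical content by looking more deeply at the probability
ideas associated with the Schrodinger equation. For a one-particle state, the
Schrodinger equation formalism says that

$$\int_\Omega \psi^*(\vec{x}) \psi(\vec{x}) \, d^3x = \text{probability particle is located in volume } \Omega \qquad (8.127)$$

The time rate of change of this probability is

$$\frac{\partial}{\partial t} \int_\Omega \psi^*(\vec{x}) \psi(\vec{x}) \, d^3x = \int_\Omega \left[ \psi^* \frac{\partial \psi}{\partial t} + \psi \frac{\partial \psi^*}{\partial t} \right] d^3x = \frac{i\hbar}{2m} \int_\Omega \left[ \psi^* \nabla^2 \psi - \psi \nabla^2 \psi^* \right] d^3x = \frac{i\hbar}{2m} \int_\Omega \nabla \cdot \left[ \psi^* \nabla \psi - \psi \nabla \psi^* \right] d^3x \qquad (8.128)$$

where we have used the time-dependent Schrodinger equation and its complex
conjugate

$$-\frac{\hbar^2}{2m} \nabla^2 \psi + V(\vec{x}) \psi = i\hbar \frac{\partial \psi}{\partial t} \quad \text{and} \quad -\frac{\hbar^2}{2m} \nabla^2 \psi^* + V(\vec{x}) \psi^* = -i\hbar \frac{\partial \psi^*}{\partial t} \qquad (8.129)$$

Since the volume $\Omega$ is arbitrary, this (8.128) implies a continuity equation of the
form

$$\frac{\partial}{\partial t} |\psi(\vec{x}, t)|^2 + \nabla \cdot \vec{J}(\vec{x}, t) = 0 \qquad (8.130)$$

where

$$\vec{J}(\vec{x}, t) = \frac{\hbar}{2mi} \left[ \psi^* \nabla \psi - \psi \nabla \psi^* \right] \qquad (8.131)$$

is called the probability flux vector or the probability current density.

Using the barrier solutions for $E > V_0$, $\gamma$ real, we have (for x-components)

$$J_I(x, t) = \frac{\hbar}{2mi} (A^* e^{-ikx} + B^* e^{ikx})(ikA e^{ikx} - ikB e^{-ikx}) - \frac{\hbar}{2mi} (A e^{ikx} + B e^{-ikx})(-ikA^* e^{-ikx} + ikB^* e^{ikx}) = \frac{\hbar k}{m} \left[ |A|^2 - |B|^2 \right] = J_{I+} - J_{I-} \qquad (8.132)$$

$$J_{II}(x, t) = \frac{\hbar}{2mi} \left[ (C^* e^{-i\gamma x})(i\gamma C e^{i\gamma x}) - (C e^{i\gamma x})(-i\gamma C^* e^{-i\gamma x}) \right] = \frac{\hbar \gamma}{m} |C|^2 = J_{II+} \qquad (8.133)$$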
m 
We then define the reflection and transmission probabilities in terms of the
currents as

$$R = \frac{J_{I-}}{J_{I+}} = \frac{|B|^2}{|A|^2} \quad \text{and} \quad T = \frac{J_{II+}}{J_{I+}} = \frac{\gamma}{k} \frac{|C|^2}{|A|^2} \qquad (8.134)$$

Now, physically, we identify

$|A|^2$ = incident beam intensity
$|B|^2$ = reflected beam intensity
$|C|^2$ = transmitted beam intensity

and therefore

R = probability that an incident wave will be reflected
T = probability that an incident wave will be transmitted

We find

$$R = \frac{(k - \gamma)^2}{(k + \gamma)^2} \quad \text{and} \quad T = \frac{4k\gamma}{(k + \gamma)^2} \qquad (8.135)$$

Notice that

$$R + T = 1 \qquad (8.136)$$

which makes physical sense since the total probability for the wave to go somewhere
must equal one.
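The formulas (8.135) are easy to evaluate numerically. The sketch below does so for $E > V_0$; the function name and the default units $m = \hbar = 1$ are assumptions made only for illustration.

```python
# Sketch: reflection and transmission probabilities for the potential step,
# valid only for E > V0. Assumption: natural units m = hbar = 1 by default.
import math


def step_coefficients(energy, v0, m=1.0, hbar=1.0):
    """Return (R, T) from (8.135) for a particle incident from the left."""
    if energy <= v0:
        raise ValueError("this sketch covers only the case E > V0")
    k = math.sqrt(2 * m * energy) / hbar             # wave number in region I
    gamma = math.sqrt(2 * m * (energy - v0)) / hbar  # wave number in region II
    r = (k - gamma) ** 2 / (k + gamma) ** 2
    t = 4 * k * gamma / (k + gamma) ** 2
    return r, t
```

For example, `step_coefficients(2.0, 1.5)` corresponds to k = 2 and $\gamma$ = 1, giving R = 1/9 and T = 8/9, which sum to one as required by (8.136). Note that reflection occurs even though $E > V_0$, which has no classical counterpart.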


Before proceeding to the case $E < V_0$, let us recast the case $E > V_0$ in terms
of the wave packet formalism we derived earlier. This will allow us to make
particle interpretations.


Remember we can construct a wavepacket from any traveling wave solution by
generating a linear combination of the traveling wave using only a restricted
range of momentum values. In particular, the incident wave packet is a linear
combination of incident waves

$$\psi_{inc}(x, t) = \int_0^\infty \frac{dp}{2\pi\hbar} f(p) A(p) e^{i(px - Et)/\hbar} \quad \text{where} \quad E = \frac{p^2}{2m} \, , \; p = \hbar k \qquad (8.137)$$

Similarly, the reflected wave packet is a linear combination of reflected waves

$$\psi_{refl}(x, t) = \int_0^\infty \frac{dp}{2\pi\hbar} f(p) B(p) e^{-i(px + Et)/\hbar} \quad \text{where} \quad E = \frac{p^2}{2m} \, , \; p = \hbar k \qquad (8.138)$$

We assume, for simplicity, that f(p) is a real function that is nonzero only in
a limited range of p-values and has a maximum near $p = p_0$. We also choose
A = 1. Our solution indicates that B(p) is a real function also and thus does
not contribute to the phase of the integrand, where the phase is the part of the
integrand of the form

$$e^{i \, \text{phase}} \qquad (8.139)$$

As we discussed earlier, the integrals are nonzero only when the phase of the
integrand is near an extremum at the same place that the rest of the integrand
is large (near $p = p_0$ by assumption since we can always choose the form of f(p)
so that it dominates). Otherwise, the exponential term oscillates so rapidly
($\hbar$ is so small) that the integral averages to zero. This is called the stationary
phase condition.

For the incident and reflected waves the stationary phase extremum argument
gives

$$\left. \frac{\partial}{\partial p} \left( px - \frac{p^2}{2m} t \right) \right|_{p=p_0} = 0 = x - \frac{p_0}{m} t = x - v_0 t \quad \text{incident wave} \qquad (8.140)$$

$$\left. \frac{\partial}{\partial p} \left( px + \frac{p^2}{2m} t \right) \right|_{p=p_0} = 0 = x + \frac{p_0}{m} t = x + v_0 t \quad \text{reflected wave} \qquad (8.141)$$

These equations tell us the location of the maximum of the wave packet (or
pulse) given by $|\psi(x, t)|^2$ in space as a function of time. We have

$$x_{inc} = v_0 t \, , \quad t < 0 \qquad (8.142)$$

which says that the incident packet or particle arrives at x = 0 at t = 0. In
this model, a particle is a localized lump of energy and momentum where the
localization is described by $|\psi(x, t)|^2$.

Note that we obtain the correct kinematic relation of a free particle as we should.
Similarly, for the reflected wave we have

$$x_{refl} = -v_0 t \, , \quad t > 0 \qquad (8.143)$$

which says that the reflected packet or particle leaves x = 0 at t = 0.

For a finite step barrier of this kind, the reflected packet leaves at the same time
that the incident packet arrives.

The transmitted packet is given by

$$\psi_{trans}(x, t) = \int_0^\infty \frac{dp}{2\pi\hbar} f(p) C(p) e^{i(\hbar\gamma x - Et)/\hbar} \quad \text{where} \quad E - V_0 = \frac{\hbar^2 \gamma^2}{2m} \, , \; E = \frac{p^2}{2m} \qquad (8.144)$$
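Since C(p) is a real function, the stationary phase extremum argument gives

$$\left. \frac{\partial}{\partial p} \left( \hbar\gamma x - \frac{p^2}{2m} t \right) \right|_{p=p_0} = 0 = x \left. \frac{\partial \sqrt{p^2 - 2mV_0}}{\partial p} \right|_{p=p_0} - \frac{p_0}{m} t = x \frac{p_0}{\sqrt{p_0^2 - 2mV_0}} - \frac{p_0}{m} t$$

$$x_{trans} = \frac{\hbar\gamma_0}{m} t = \tilde{v}_0 t \quad \text{transmitted wave} \qquad (8.145)$$

where $\hbar\gamma_0 = \sqrt{p_0^2 - 2mV_0}$. This says that the transmitted wave moves in the region x > 0, t > 0 with a
speed

$$\tilde{v}_0 = \frac{\hbar\gamma_0}{m} < \frac{p_0}{m} \qquad (8.146)$$

as it should.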


Summarizing, this wave packet analysis says that the probability amplitude for 
the particle being at a point in space and time is completely localized in the 
incident wave for t < 0 as the packet travels towards x = 0 from the left (it arrives 
at x = 0 at t = 0). For t > 0 the packet then splits into two packets, namely the 
reflected and transmitted packets, such that the probability amplitude for the 
particle is now localized in two regions. The localized packet traveling towards 
the left represents the probability of the particle being reflected and the localized 
packet traveling towards the right represents the probability of the particle being 
transmitted. 


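Now what happens if $E = p^2/2m < V_0$? We then have the solutions (let A = 1)

$$x < 0 \qquad \psi_I = e^{ikx} + R e^{-ikx} \qquad (8.147)$$

$$x > 0 \qquad \psi_{II} = S e^{-\beta x} \qquad (8.148)$$

where $\hbar k = \sqrt{2mE}$ and $\hbar\beta = \sqrt{2m(V_0 - E)}$. We have excluded the mathematical
solution $e^{\beta x}$ for x > 0 because it would cause the integral

$$\int_{-\infty}^{\infty} |\psi(x)|^2 \, dx \qquad (8.149)$$

to diverge, which means it is a non-normalizable or unphysical solution. This is
an example of a solution not being well-behaved. At x = 0, we have the continuity
conditions which give

$$1 + R = S \quad \text{and} \quad 1 - R = -\frac{\beta}{ik} S \quad \text{or} \quad R = \frac{ik + \beta}{ik - \beta} \quad \text{and} \quad S = \frac{2ik}{ik - \beta} \qquad (8.150)$$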
Since the solution for x > 0 is not a traveling wave, there is no transmitted
packet. The incident wave packet analysis is the same as before. The reflected
wave packet analysis is different because R(p) is not a real function. In fact,

$$|R|^2 = 1 \quad \rightarrow \quad R = e^{i\phi} = \frac{ip + \sqrt{2mV_0 - p^2}}{ip - \sqrt{2mV_0 - p^2}} \qquad (8.151)$$

or

$$e^{i\phi} = \frac{ip + \sqrt{2mV_0 - p^2}}{ip - \sqrt{2mV_0 - p^2}} \cdot \frac{ip + \sqrt{2mV_0 - p^2}}{ip + \sqrt{2mV_0 - p^2}} = \frac{p^2 - mV_0 - ip\sqrt{2mV_0 - p^2}}{mV_0} = \cos\phi + i\sin\phi \qquad (8.152)$$

with

$$\cos\phi = \frac{p^2 - mV_0}{mV_0} \quad \text{and} \quad \sin\phi = -\frac{p\sqrt{2mV_0 - p^2}}{mV_0} \qquad (8.153)$$

Therefore, we have as wave packet representations

$$\psi_{inc}(x, t) = \int_0^\infty \frac{dp}{2\pi\hbar} f(p) e^{i(px - Et)/\hbar} \quad \text{where} \quad E = \frac{p^2}{2m} \, , \; p = \hbar k \qquad (8.154)$$

$$\psi_{refl}(x, t) = \int_0^\infty \frac{dp}{2\pi\hbar} f(p) e^{-i(px + Et - \hbar\phi(p))/\hbar} \quad \text{where} \quad E = \frac{p^2}{2m} \, , \; p = \hbar k \qquad (8.155)$$

The stationary phase argument implies

$$x = \frac{p_0}{m} t = v_0 t \quad \text{for the incident packet} \qquad (8.156)$$

which says that the incident packet arrives at x = 0 at t = 0 and

$$x = -\frac{p_0}{m} t + \hbar \left. \frac{\partial \phi(p)}{\partial p} \right|_{p=p_0} \quad \text{for the reflected packet} \qquad (8.157)$$

This says that the reflected wave packet leaves x = 0 at

$$t = t_{delay} = \frac{\hbar m}{p_0} \left. \frac{d\phi(p)}{dp} \right|_{p=p_0} \qquad (8.158)$$

which is NOT the same time that the incident packet arrived!

Continuing we have

$$\frac{\partial \cos\phi(p)}{\partial p} = -\sin\phi(p) \frac{d\phi(p)}{dp} = \frac{2p}{mV_0} \qquad (8.159)$$

which gives

$$\left. \frac{d\phi(p)}{dp} \right|_{p=p_0} = \frac{2}{\sqrt{2mV_0 - p_0^2}} \qquad (8.160)$$

$$t_{delay} = \frac{m\hbar}{p_0} \frac{2}{\sqrt{2mV_0 - p_0^2}} = \frac{2m}{p_0} \frac{1}{\beta} \qquad (8.161)$$

Now, we found that

$$\psi_{II}(x) = S e^{-\beta x} \qquad (8.162)$$

which says that the probability amplitude is significantly different from zero up
to a distance $d = 1/\beta$ approximately. In other words,

$$\psi_{II}(x = 1/\beta) = \frac{1}{e} \psi_{II}(x = 0) \quad \rightarrow \quad |\psi_{II}(x = 1/\beta)|^2 \approx \frac{1}{7} |\psi_{II}(x = 0)|^2 \qquad (8.163)$$

Therefore, to a good approximation, we have the striking result $t_{delay} = 2d/v_0$.
What does this mean? It seems as if there is a nonzero probability for finding
the particle in the classically forbidden region x > 0, where K = kinetic energy
$= E - V_0$, since $\psi_{II}(x) \propto e^{-\beta x} \ne 0$ there. Since K cannot be less than zero, it
seems like we must have a violation of conservation of energy if the particle were
actually found in the region x > 0. This is NOT the case.


If we observe the particle in this forbidden region, then it will no longer be in
a state with $E < V_0$. The act of measuring the location of the particle must
necessarily introduce an uncertainty in p and hence in E. The particle seems to
have an appreciable probability to exist in the region up to a distance $d = 1/\beta$.
If we observe the particle on the right, we have then localized it such that
$\Delta x \approx 1/\beta$. This says that we have introduced an uncertainty in p

$$\Delta p \ge \frac{\hbar}{\Delta x} \approx \hbar\beta \qquad (8.164)$$

and a corresponding energy uncertainty

$$\Delta E \approx \frac{(\Delta p)^2}{2m} \ge \frac{\hbar^2 \beta^2}{2m} = V_0 - E \qquad (8.165)$$

This implies that E is now uncertain enough that we can no longer claim that
energy conservation has been violated! Quantum mechanics has a way of covering
its own tracks!
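The delay-time result $t_{delay} = 2d/v_0$ can be checked numerically by differentiating the phase of the reflection amplitude directly. The sketch below is only an illustration under stated assumptions: the function names, the units $m = \hbar = 1$ used in the example, and the finite-difference step are ours. It takes $R(p) = (ip + \hbar\beta)/(ip - \hbar\beta)$ with $\hbar\beta = \sqrt{2mV_0 - p^2}$, which is valid only for $p^2 < 2mV_0$.

```python
# Sketch: reflection delay time from the phase of R(p) for E < V0.
# Assumption: requires p^2 < 2 m V0; the phase is differentiated numerically.
import cmath
import math


def reflection_phase(p, m, v0):
    """Phase phi(p) of R = (ip + s)/(ip - s), with s = hbar*beta."""
    s = math.sqrt(2 * m * v0 - p * p)
    r = (1j * p + s) / (1j * p - s)   # |r| = 1, so r = exp(i*phi)
    return cmath.phase(r)


def delay_time(p0, m, v0, hbar=1.0, dp=1e-6):
    """t_delay = (hbar*m/p0) * dphi/dp at p0, via a central difference."""
    dphi = (reflection_phase(p0 + dp, m, v0)
            - reflection_phase(p0 - dp, m, v0)) / (2 * dp)
    return hbar * m / p0 * dphi


def delay_from_penetration_depth(p0, m, v0, hbar=1.0):
    """2*d/v0 with d = 1/beta and v0 = p0/m."""
    beta = math.sqrt(2 * m * v0 - p0 * p0) / hbar
    return 2.0 * (1.0 / beta) / (p0 / m)
```

With m = 1, $V_0$ = 1 and $p_0$ = 1 both functions give a delay of 2 in these units. The picture that emerges is of a packet that effectively spends the time needed to travel a distance d into the step and back out again.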


8.5.2. Tunneling 

We now change the potential energy function so that the barrier (step) is not
infinitely thick (which prevented a traveling wave from existing for $E < V_0$, x > 0).
The new potential energy function is shown in Figure 8.5 below.

Figure 8.5: Finite Step - Finite Width barrier

For $E > V_0$ the results are similar to those for the infinitely thick barrier and no new physical
ideas appear. For $E < V_0$, however, we get some very interesting new physical
results. As we shall see, it turns out that a real traveling wave can appear on
the other side of the barrier (even though there are no sources on that side) in
this case. This is called quantum tunneling. Let us see how it works.

We have three regions I, II and III to consider as shown in Figure 8.5.

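We get the solutions

$$x < 0 \qquad -\frac{\hbar^2}{2m} \frac{d^2 \psi_I}{dx^2} = E \psi_I$$

$$\psi_I = A_1 e^{ikx} + B_1 e^{-ikx} \, , \quad E = \frac{p^2}{2m} = \frac{\hbar^2 k^2}{2m} \, , \quad k \text{ real} \qquad (8.166)$$

$$0 < x < a \qquad -\frac{\hbar^2}{2m} \frac{d^2 \psi_{II}}{dx^2} + V_0 \psi_{II} = E \psi_{II}$$

$$\psi_{II} = C e^{\gamma x} + D e^{-\gamma x} \, , \quad V_0 - E = \frac{\hbar^2 \gamma^2}{2m} \, , \quad \gamma \text{ real} \qquad (8.167)$$

$$x > a \qquad -\frac{\hbar^2}{2m} \frac{d^2 \psi_{III}}{dx^2} = E \psi_{III}$$

$$\psi_{III} = A_2 e^{ikx} + B_2 e^{-ikx} \, , \quad E = \frac{p^2}{2m} = \frac{\hbar^2 k^2}{2m} \, , \quad k \text{ real} \qquad (8.168)$$

The probability current does not vanish at $x = \pm\infty$, which implies that we must
assume the existence of distant sources and sinks of probability (or particles).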


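We have two sets of continuity equations (at x = 0 and x = a). At x = 0 we get

$$\psi_I(0) = \psi_{II}(0) \rightarrow A_1 + B_1 = C + D \qquad (8.169)$$

$$\frac{d\psi_I(0)}{dx} = \frac{d\psi_{II}(0)}{dx} \rightarrow ik(A_1 - B_1) = \gamma(C - D) \qquad (8.170)$$

and at x = a we get

$$\psi_{II}(a) = \psi_{III}(a) \rightarrow C e^{\gamma a} + D e^{-\gamma a} = A_2 e^{ika} + B_2 e^{-ika} \qquad (8.171)$$

$$\frac{d\psi_{II}(a)}{dx} = \frac{d\psi_{III}(a)}{dx} \rightarrow \gamma(C e^{\gamma a} - D e^{-\gamma a}) = ik(A_2 e^{ika} - B_2 e^{-ika}) \qquad (8.172)$$

We can restate these equations in matrix form. At x = 0 we get

$$\begin{pmatrix} 1 & 1 \\ ik & -ik \end{pmatrix} \begin{pmatrix} A_1 \\ B_1 \end{pmatrix} = \hat{M}_1 \begin{pmatrix} A_1 \\ B_1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ \gamma & -\gamma \end{pmatrix} \begin{pmatrix} C \\ D \end{pmatrix} = \hat{M}_2 \begin{pmatrix} C \\ D \end{pmatrix} \qquad (8.173)$$

and at x = a we get

$$\begin{pmatrix} e^{\gamma a} & e^{-\gamma a} \\ \gamma e^{\gamma a} & -\gamma e^{-\gamma a} \end{pmatrix} \begin{pmatrix} C \\ D \end{pmatrix} = \hat{M}_3 \begin{pmatrix} C \\ D \end{pmatrix} = \begin{pmatrix} e^{ika} & e^{-ika} \\ ik e^{ika} & -ik e^{-ika} \end{pmatrix} \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \hat{M}_4 \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} \qquad (8.174)$$

The transmission/reflection properties of the barrier are given by the coefficients
$A_1, B_1, A_2, B_2$, which we assume are related by a transfer matrix $\hat{Y}$, where

$$\begin{pmatrix} A_1 \\ B_1 \end{pmatrix} = \hat{Y} \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} \qquad (8.175)$$

We then have

$$\hat{M}_1 \begin{pmatrix} A_1 \\ B_1 \end{pmatrix} = \hat{M}_2 \begin{pmatrix} C \\ D \end{pmatrix} \rightarrow \begin{pmatrix} A_1 \\ B_1 \end{pmatrix} = \hat{M}_1^{-1} \hat{M}_2 \begin{pmatrix} C \\ D \end{pmatrix} \qquad (8.176)$$

$$\hat{M}_3 \begin{pmatrix} C \\ D \end{pmatrix} = \hat{M}_4 \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} \rightarrow \begin{pmatrix} C \\ D \end{pmatrix} = \hat{M}_3^{-1} \hat{M}_4 \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} \qquad (8.177)$$

or

$$\begin{pmatrix} A_1 \\ B_1 \end{pmatrix} = \hat{M}_1^{-1} \hat{M}_2 \hat{M}_3^{-1} \hat{M}_4 \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} \qquad (8.178)$$

and thus

$$\hat{Y} = \hat{M}_1^{-1} \hat{M}_2 \hat{M}_3^{-1} \hat{M}_4 \qquad (8.179)$$

A great deal of algebra gives

$$Y_{11} = e^{ika} \left[ \cosh\gamma a + \frac{i}{2} \sinh\gamma a \left( \frac{\gamma}{k} - \frac{k}{\gamma} \right) \right] \qquad (8.180)$$

$$Y_{12} = \frac{i}{2} e^{-ika} \sinh\gamma a \left( \frac{\gamma}{k} + \frac{k}{\gamma} \right) \qquad (8.181)$$

$$Y_{21} = -\frac{i}{2} e^{ika} \sinh\gamma a \left( \frac{\gamma}{k} + \frac{k}{\gamma} \right) \qquad (8.182)$$

$$Y_{22} = e^{-ika} \left[ \cosh\gamma a - \frac{i}{2} \sinh\gamma a \left( \frac{\gamma}{k} - \frac{k}{\gamma} \right) \right] \qquad (8.183)$$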
If we have sources only on the left of the barrier, then we must choose $B_2 = 0$.
The currents are

$$J_{LEFT}(x, t) = \frac{\hbar k}{m} \left[ |A_1|^2 - |B_1|^2 \right] = J_{LEFT+}(x, t) - J_{LEFT-}(x, t) \qquad x < 0 \qquad (8.184)$$

$$J_{RIGHT}(x, t) = J_{RIGHT+}(x, t) = \frac{\hbar k}{m} |A_2|^2 \qquad x > a \qquad (8.185)$$

The reflection and transmission probabilities are given by

$$R = \frac{|B_1|^2}{|A_1|^2} = \frac{|Y_{21}|^2}{|Y_{11}|^2} \, , \quad T = \frac{|A_2|^2}{|A_1|^2} = \frac{1}{|Y_{11}|^2} \qquad (8.186)$$

Algebra shows R + T = 1 as it must in order to conserve probability (or particles).
Evaluating the expression for T we get

$$T = \frac{1}{1 + \dfrac{V_0^2 \sinh^2\gamma a}{4E(V_0 - E)}} \, , \quad \gamma^2 = \frac{2m}{\hbar^2} (V_0 - E) \qquad (8.187)$$

The fact that T > 0 for $E < V_0$ implies the existence of tunneling. The probability
amplitude leaks through the barrier.


It is important to realize that the fact that T > 0 DOES NOT say that particles
or wave packets passed through the barrier. No measurement can be done on
the system that will allow us to observe a particle in the region 0 < x < a with
$E < V_0$, since this would violate energy conservation.

It is ONLY probability that is leaking through. If this causes the probability
amplitude and hence the probability to be nonzero on the other side of the
barrier, then it must be possible for us to observe the particle on the other side,
i.e., we can observe the particle on the right side of the barrier with $E < V_0$, but
we can never observe it in the region of the barrier with $E < V_0$. That is what
is being said here.
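To get a feel for the size of the tunneling probability, (8.187) can be evaluated directly and cross-checked against $T = 1/|Y_{11}|^2$ from (8.180) and (8.186). The sketch below assumes natural units $m = \hbar = 1$ by default; the function names are ours.

```python
# Sketch: tunneling probability through a rectangular barrier for 0 < E < V0.
# Assumption: natural units m = hbar = 1 unless other values are passed in.
import math


def tunneling_probability(energy, v0, a, m=1.0, hbar=1.0):
    """T from the closed form (8.187)."""
    if not 0 < energy < v0:
        raise ValueError("this sketch covers only 0 < E < V0")
    gamma = math.sqrt(2 * m * (v0 - energy)) / hbar
    return 1.0 / (1.0 + v0 ** 2 * math.sinh(gamma * a) ** 2
                  / (4 * energy * (v0 - energy)))


def transmission_from_y11(energy, v0, a, m=1.0, hbar=1.0):
    """T = 1/|Y11|^2 using the transfer matrix element (8.180)."""
    k = math.sqrt(2 * m * energy) / hbar
    gamma = math.sqrt(2 * m * (v0 - energy)) / hbar
    y11_abs2 = (math.cosh(gamma * a) ** 2
                + 0.25 * (gamma / k - k / gamma) ** 2 * math.sinh(gamma * a) ** 2)
    return 1.0 / y11_abs2
```

The two functions agree for any allowed input, and T falls off roughly as $e^{-2\gamma a}$ once $\gamma a$ is large, which is why tunneling rates are so sensitive to the barrier width.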


8.5.3. Bound States 


Infinite Square Well 

We now consider the potential energy function

$$V(x) = \begin{cases} 0 & -\frac{a}{2} < x < \frac{a}{2} \\ \infty & |x| > \frac{a}{2} \end{cases} \qquad (8.188)$$

This is the so-called infinite square well shown in Figure 8.6 below.

This is an example of a potential that is infinite in an extended region. Therefore,
we must require that the wave function $\psi(x) = 0$ in these regions or the
Schrodinger equation makes no sense mathematically. In this case we have


In region II we have 


ipi(x) = 0 and ipm(x) = 0 


h 2 d 2 i/>ii p 2 h 2 k 2 

2m dx 2 2m 2m 


(8.189) 


(8.190) 


iPh(x) = Ae ikx + Be ikx (8.191) 

The continuity of the wave function at $x = \pm a/2$ says that we must have

$$\psi_{II}(-a/2) = Ae^{-ika/2} + Be^{ika/2} = 0 \tag{8.192}$$

$$\psi_{II}(a/2) = Ae^{ika/2} + Be^{-ika/2} = 0 \tag{8.193}$$

which imply that

$$\frac{B}{A} = -e^{-ika} = -e^{ika} \tag{8.194}$$


This is an equation for the allowed values (values corresponding to a valid solution) of $k$. This equation is

$$e^{2ika} = 1 \tag{8.195}$$

The allowed values of $k$ form a discrete spectrum of energy eigenvalues (quantized energies) given by

$$2k_na = 2n\pi \;\rightarrow\; k_n = \frac{n\pi}{a} \;\rightarrow\; E_n = \frac{\hbar^2k_n^2}{2m} = \frac{n^2\pi^2\hbar^2}{2ma^2}, \qquad n = 1,2,3,4,\ldots \tag{8.196}$$


The corresponding wave functions are

$$\psi_{II}^{(n)}(x) = A_n\left(e^{ik_nx} - e^{-ik_na}e^{-ik_nx}\right) = A_ne^{-ik_na/2}\left(e^{ik_n(x+a/2)} - e^{-ik_n(x+a/2)}\right) = \tilde{A}_n\sin k_n(x + a/2) \tag{8.197}$$

where $\tilde{A}_n = 2iA_ne^{-ik_na/2}$ is determined by the normalization condition

$$\int_{-a/2}^{a/2}\left|\psi_{II}^{(n)}(x)\right|^2dx = 1 \tag{8.198}$$

which gives $|\tilde{A}_n| = \sqrt{2/a}$.

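Substituting the value of $k_n$ from (8.196) into (8.197) we get

$$\psi_{II}^{(1)}(x) = \tilde{A}_1\sin k_1(x + a/2) = \tilde{A}_1\sin\left(\pi x/a + \pi/2\right) = \tilde{A}_1\cos(\pi x/a) \tag{8.199}$$

$$\psi_{II}^{(2)}(x) = \tilde{A}_2\sin k_2(x + a/2) = \tilde{A}_2\sin\left(2\pi x/a + \pi\right) = -\tilde{A}_2\sin(2\pi x/a) \tag{8.200}$$

$$\psi_{II}^{(3)}(x) = \tilde{A}_3\sin k_3(x + a/2) = \tilde{A}_3\sin\left(3\pi x/a + 3\pi/2\right) = -\tilde{A}_3\cos(3\pi x/a) \tag{8.201}$$

or, absorbing the overall sign into the normalization constant,

$$\psi_n(x) = \sqrt{\frac{2}{a}}\begin{cases}\sin(n\pi x/a) & n \text{ even} \\ \cos(n\pi x/a) & n \text{ odd}\end{cases} \qquad |x| \leq \frac{a}{2}, \quad n = 1,2,3,4,\ldots \tag{8.202}$$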


We have mathematically solved the ordinary differential equation problem. Now, what is the physical meaning of these results?

We find a discrete spectrum of allowed energies corresponding to bound states of 
the Hamiltonian; the energy is quantized. Bound states designate states which 
are localized in space, i.e., the probability is large only over restricted regions 
of space and goes to zero far from the potential region. 
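The quantized energies (8.196) and the definite-parity wave functions (8.202) are easy to check numerically. The sketch below is only an illustration: the function names, the choice of an electron in a 1 nm well, and the midpoint-rule integration are assumptions made here, not part of the text.

```python
import math

HBAR = 1.054571817e-34      # J s
M_ELECTRON = 9.1093837015e-31  # kg
EV = 1.602176634e-19        # J per eV


def well_energy(n, a, m=M_ELECTRON):
    """E_n = n^2 pi^2 hbar^2 / (2 m a^2) for the infinite square well, in joules."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * m * a * a)


def well_psi(n, x, a):
    """Normalized eigenfunction: cos for odd n, sin for even n, zero outside the well."""
    if abs(x) > a / 2.0:
        return 0.0
    amp = math.sqrt(2.0 / a)
    if n % 2 == 1:
        return amp * math.cos(n * math.pi * x / a)
    return amp * math.sin(n * math.pi * x / a)


def norm_integral(n, a, steps=2000):
    """Midpoint-rule estimate of the integral of |psi_n|^2 over the well."""
    h = a / steps
    return h * sum(well_psi(n, -a / 2.0 + (i + 0.5) * h, a) ** 2 for i in range(steps))


if __name__ == "__main__":
    a = 1.0e-9  # hypothetical well width of 1 nm
    for n in (1, 2, 3):
        print(n, well_energy(n, a) / EV, "eV", norm_integral(n, a))
```

The energies grow as $n^2$, and each state integrates to unit probability inside the well.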


The lowest energy value or lowest energy level or ground state energy is

$$E_1 = \frac{\pi^2\hbar^2}{2ma^2} > 0 \tag{8.203}$$

with

$$\psi_1(x) = \begin{cases}\sqrt{\frac{2}{a}}\cos(\pi x/a) & |x| \leq \frac{a}{2} \\ 0 & |x| > \frac{a}{2}\end{cases} \tag{8.204}$$


This minimum energy is not zero because of the Heisenberg uncertainty principle. Since the particle has a nonzero amplitude for being in the well, we say that it is localized such that $\Delta x \approx a$ and thus

$$\Delta p \geq \frac{\hbar}{\Delta x} \approx \frac{\hbar}{a} \tag{8.205}$$

This says that the kinetic energy (or energy in this case because the potential energy equals zero in region II) must have a minimum value given approximately by

$$E_{min} \approx \frac{(\Delta p)^2}{2m} \approx \frac{\hbar^2}{2ma^2} \tag{8.206}$$
Let us look more closely at the wave functions.

The integer $n - 1$ corresponds to the number of nodes (zeros) of the wave function (other than the well edges).

They also have the property

$$\psi(-x) = \psi(x) \text{ for } n \text{ odd} \quad\text{and}\quad \psi(-x) = -\psi(x) \text{ for } n \text{ even} \tag{8.207}$$

The above discrete transformation of the wave function corresponds to the parity operator $\mathcal{P}$ where we have

$$\mathcal{P}\psi(x) = \psi(-x) = \psi(x) \quad\text{means even parity} \tag{8.208}$$

$$\mathcal{P}\psi(x) = \psi(-x) = -\psi(x) \quad\text{means odd parity} \tag{8.209}$$

Let us look more generally at the parity operation. Suppose that the potential energy function obeys the rule $V(\vec{x}) = V(-\vec{x})$ and let $\psi(\vec{x})$ be a solution of the Schrödinger equation with energy $E$

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\vec{x})\right)\psi(\vec{x}) = E\psi(\vec{x}) \tag{8.210}$$

Now let $\vec{x} \to -\vec{x}$ to get the equation

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V(-\vec{x})\right)\psi(-\vec{x}) = E\psi(-\vec{x}) \tag{8.211}$$

or

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\vec{x})\right)\psi(-\vec{x}) = E\psi(-\vec{x}) \tag{8.212}$$

This says that, if $\psi(\vec{x})$ is a solution of the Schrödinger equation with energy $E$, then $\psi(-\vec{x})$ is also a solution of the Schrödinger equation with the same energy $E$. This says that the combinations

$$\psi(\vec{x}) \pm \psi(-\vec{x}) \tag{8.213}$$

are also solutions of the Schrödinger equation with the same energy $E$. Now

$\psi(\vec{x}) + \psi(-\vec{x})$ is an even parity solution

$\psi(\vec{x}) - \psi(-\vec{x})$ is an odd parity solution

This says that if $V(\vec{x}) = V(-\vec{x})$ (a symmetric potential), then we can always choose solutions that have a definite parity (even or odd).

We formally define the parity operator by the relation

$$\langle\vec{x}|\mathcal{P}|\psi\rangle = \langle-\vec{x}|\psi\rangle \tag{8.214}$$
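Since

$$\langle\vec{x}|\mathcal{P}^2|\psi\rangle = \langle-\vec{x}|\mathcal{P}|\psi\rangle = \langle\vec{x}|\psi\rangle \tag{8.215}$$

we must have

$$\mathcal{P}^2 = \hat{I} \tag{8.216}$$

which means the eigenvalues of $\mathcal{P}$ are $\pm 1$ as we indicated earlier. This also says that $\mathcal{P}^{-1} = \mathcal{P}$.

We can show $[\hat{H},\mathcal{P}] = 0$ for symmetric potentials by

$$\mathcal{P}\hat{H}|E\rangle = \mathcal{P}E|E\rangle = E\mathcal{P}|E\rangle = \pm E|E\rangle$$

$$\hat{H}\mathcal{P}|E\rangle = \pm\hat{H}|E\rangle = \pm E|E\rangle$$

$$(\mathcal{P}\hat{H} - \hat{H}\mathcal{P})|E\rangle = 0$$

$$[\hat{H},\mathcal{P}] = 0$$

since $|E\rangle$ is an arbitrary energy eigenstate (chosen with definite parity, which is always possible for a symmetric potential). As we saw earlier, this commutator relationship says that

$$\hat{H}\mathcal{P} = \mathcal{P}\hat{H}$$

$$\mathcal{P}\hat{H}\mathcal{P} = \mathcal{P}^2\hat{H} = \hat{H}$$

$$\mathcal{P}^{-1}\hat{H}\mathcal{P} = \hat{H} \tag{8.217}$$

which means that $\hat{H}$ is invariant under the $\mathcal{P}$ transformation. We have used $\mathcal{P}^2 = \hat{I}$ in this derivation. It also says that

$$\hat{H}(\mathcal{P}|E\rangle) = \mathcal{P}\hat{H}|E\rangle = E(\mathcal{P}|E\rangle) \tag{8.218}$$

or $\mathcal{P}|E\rangle$ is an eigenstate of $\hat{H}$ with energy $E$ as we stated above. The concept of parity invariance and the fact that $\hat{H}$ and $\mathcal{P}$ share a common set of eigenfunctions can greatly simplify the solution of the Schrödinger equation in many cases.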
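The statement $[\hat{H},\mathcal{P}] = 0$ for a symmetric potential can be illustrated on a grid. The sketch below is only an illustration under stated assumptions: it uses a simple three-point finite-difference Hamiltonian with $\hbar = m = 1$ on a grid symmetric about $x = 0$, and represents parity as reversal of the grid. The function names and grid parameters are hypothetical choices made here.

```python
def grid_hamiltonian(potential, points=41, half_width=5.0):
    """Finite-difference H = -(1/2) d^2/dx^2 + V(x) on a grid symmetric about x = 0."""
    h = 2.0 * half_width / (points - 1)
    xs = [-half_width + i * h for i in range(points)]
    H = [[0.0] * points for _ in range(points)]
    for i, x in enumerate(xs):
        H[i][i] = 1.0 / h**2 + potential(x)
        if i > 0:
            H[i][i - 1] = -0.5 / h**2
        if i < points - 1:
            H[i][i + 1] = -0.5 / h**2
    return H


def parity_commutator_size(H):
    """Largest |(HP - PH)_ij|, where (P psi)_i = psi_{n-1-i} reverses the grid.

    With P_ij = delta(i, n-1-j): (HP)_ij = H[i][n-1-j] and (PH)_ij = H[n-1-i][j].
    """
    n = len(H)
    return max(
        abs(H[i][n - 1 - j] - H[n - 1 - i][j]) for i in range(n) for j in range(n)
    )


if __name__ == "__main__":
    symmetric = grid_hamiltonian(lambda x: 0.5 * x * x)  # V(x) = V(-x)
    tilted = grid_hamiltonian(lambda x: x)               # V(x) != V(-x)
    print(parity_commutator_size(symmetric))  # zero up to rounding
    print(parity_commutator_size(tilted))     # clearly nonzero
```

The commutator vanishes (up to rounding) only when the potential is even, which is the grid version of the argument above.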


The Finite Square Well 


We now consider the potential energy function

$$V(x) = \begin{cases}-V_0 & |x| \leq \frac{a}{2} \\ 0 & |x| > \frac{a}{2}\end{cases} \tag{8.219}$$

This is the so-called finite square well (in one dimension) and it is shown in Figure 8.7 below.
The solutions are:

Region I: $x < -\frac{a}{2}$

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_I}{dx^2} = E\psi_I, \quad 0 > E > -V_0, \quad \hbar^2k^2 = 2m|E|, \quad E = -|E|$$

with solutions

$$\psi_I(x) = Ae^{-kx} + Be^{kx} \tag{8.220}$$

Since $x = -\infty$ is included in this region, we must exclude the $e^{-kx}$ term by choosing $A = 0$, which gives

$$\psi_I(x) = Be^{kx} \qquad x < -\frac{a}{2} \tag{8.221}$$

Region II: $-\frac{a}{2} \leq x \leq \frac{a}{2}$

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{II}}{dx^2} - V_0\psi_{II} = E\psi_{II}, \quad 0 > E > -V_0, \quad p^2 = 2m(V_0 - |E|), \quad E = -|E|$$

with solutions

$$\psi_{II}(x) = Ce^{ipx/\hbar} + De^{-ipx/\hbar} \tag{8.222}$$

Region III: $x > \frac{a}{2}$

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{III}}{dx^2} = E\psi_{III}, \quad 0 > E > -V_0, \quad \hbar^2k^2 = 2m|E|, \quad E = -|E|$$

with solutions

$$\psi_{III}(x) = Fe^{kx} + Ge^{-kx} \tag{8.223}$$

Since $x = +\infty$ is included in this region, we must exclude the $e^{kx}$ term by choosing $F = 0$, which gives

$$\psi_{III}(x) = Ge^{-kx} \qquad x > \frac{a}{2} \tag{8.224}$$

These results represent a general solution to the problem. There seem to be 4 unknown constants, namely, $B$, $C$, $D$, and $G$. However, since $V(x) = V(-x)$, parity is conserved and we can choose even and odd solutions, or solutions of definite parity.


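Even parity implies $\psi(x) = \psi(-x)$ or $G = B$ and $C = D$. This solution is (absorbing a factor of 2 into $C$)

$$\psi(x) = \begin{cases}C\cos(px/\hbar) & |x| \leq \frac{a}{2} \\ Be^{-kx} & x > \frac{a}{2} \\ Be^{kx} & x < -\frac{a}{2}\end{cases} \tag{8.225}$$

Odd parity implies $\psi(x) = -\psi(-x)$ or $G = -B$ and $D = -C$. This solution is (after relabeling the constants)

$$\psi(x) = \begin{cases}C\sin(px/\hbar) & |x| \leq \frac{a}{2} \\ Be^{-kx} & x > \frac{a}{2} \\ -Be^{kx} & x < -\frac{a}{2}\end{cases} \tag{8.226}$$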

Thus, by using parity we reduce the number of unknowns in the problem to two for each type of solution. We now impose the continuity conditions of the wave function and its derivative only at $x = a/2$ for both solutions. Since these are definite parity solutions, the continuity condition at $x = -a/2$ will give no new information and is not needed.


Even Parity Results

$$C\cos\frac{pa}{2\hbar} = Be^{-ka/2} \quad\text{and}\quad -\frac{p}{\hbar}C\sin\frac{pa}{2\hbar} = -kBe^{-ka/2} \tag{8.227}$$

or

$$\frac{B}{C} = e^{ka/2}\cos\frac{pa}{2\hbar} = \frac{p}{\hbar k}e^{ka/2}\sin\frac{pa}{2\hbar} \tag{8.228}$$

so that

$$p\tan\frac{pa}{2\hbar} = \hbar k \tag{8.229}$$

This last equation is a transcendental equation for $E$ and its solutions determine the allowed $E$ values for the even parity states for this potential energy function. These $E$ values are the even parity energies or energy levels of a particle in the finite square well potential.

Odd Parity Results

$$C\sin\frac{pa}{2\hbar} = Be^{-ka/2} \quad\text{and}\quad \frac{p}{\hbar}C\cos\frac{pa}{2\hbar} = -kBe^{-ka/2} \tag{8.230}$$

or

$$\frac{B}{C} = e^{ka/2}\sin\frac{pa}{2\hbar} = -\frac{p}{\hbar k}e^{ka/2}\cos\frac{pa}{2\hbar} \tag{8.231}$$

so that

$$p\cot\frac{pa}{2\hbar} = -\hbar k \tag{8.232}$$

Again, this last equation is a transcendental equation for $E$ and its solutions determine the allowed $E$ values for the odd parity states for this potential energy function. These $E$ values are the odd parity energies or energy levels of a particle in the finite square well potential.

In general, at this stage of the solution, we must either devise a clever numerical 
or graphical trick to find the solutions of the transcendental equations or resort 
to a computer. 


The first thing one should always do is change variables to get rid of as many extraneous constants as possible. In this case we let

$$\beta = ka = \frac{a}{\hbar}\sqrt{2m|E|}, \qquad \alpha = \gamma a = \frac{p}{\hbar}a = \frac{a}{\hbar}\sqrt{2m(V_0 - |E|)} \tag{8.233}$$

The first useful equation we can derive is

$$\alpha^2 + \beta^2 = \frac{2mV_0a^2}{\hbar^2} = \text{constant for a given well} \tag{8.234}$$

This is the equation of a circle of radius

$$\sqrt{\frac{2mV_0a^2}{\hbar^2}} \tag{8.235}$$

With these new variables the two transcendental equations are

$$\beta = \alpha\tan\frac{\alpha}{2} \text{ for even parity} \quad\text{and}\quad \beta = -\alpha\cot\frac{\alpha}{2} \text{ for odd parity} \tag{8.236}$$

We can find solutions graphically by plotting as shown in Figure 8.8 below for the case (effectively a choice of the quantity $V_0a^2$)

$$\text{circle radius} = \sqrt{\frac{2mV_0a^2}{\hbar^2}} = \frac{5\pi}{2} \tag{8.237}$$

The solutions correspond to the intersections of the circle (fixed for a given well) and the curves represented by the two transcendental equations. They are shown in Figure 8.8.

For the choice of potential well shown in the figure we have 2 even parity solutions and 1 odd parity solution. These correspond to the allowed energy levels for this particular well and the corresponding wave functions and energies represent bound states of the well.
[Figure 8.8: Solutions for $\sqrt{2mV_0a^2/\hbar^2} = 5\pi/2$. Plot title: Transcendental Equations; horizontal axis: alpha.]

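We can also do a straight numerical solution for even parity by rearranging the equations as follows:

$$\alpha^2 + \beta^2 = \frac{2mV_0a^2}{\hbar^2} \quad\text{and}\quad \beta = \alpha\tan\frac{\alpha}{2}$$

$$\alpha^2\left(1 + \tan^2\frac{\alpha}{2}\right) = \frac{\alpha^2}{\cos^2\frac{\alpha}{2}} = \frac{25\pi^2}{4} \tag{8.238}$$

$$\alpha^2 - \frac{25\pi^2}{4}\cos^2\frac{\alpha}{2} = f(\alpha) = 0 \tag{8.239}$$

The numerical solution of this equation can be carried out by any standard technique (Newton-Raphson method, for instance) for finding the zeros of the function $f(\alpha)$. Only zeros with $\tan(\alpha/2) > 0$ are kept, since $\beta$ must be positive. For this case we get

$$\alpha = 2.4950 \quad\text{and}\quad 7.1416 \tag{8.240}$$

which is clearly in agreement with the graphical result.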
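The numerical solution can be sketched in a few lines of Python. This is only an illustration: it uses bisection on the equations (8.236) directly (rather than Newton-Raphson on $f(\alpha)$), which avoids the spurious zeros of $f$ where $\tan(\alpha/2) < 0$. The function names and bracketing intervals are choices made here, not taken from the text.

```python
import math

RADIUS = 5.0 * math.pi / 2.0  # sqrt(2 m V0 a^2) / hbar, the case used in the text


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    flo = f(lo)
    if flo * f(hi) > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)


def even_condition(alpha):
    """alpha tan(alpha/2) - beta, with beta fixed by the circle alpha^2 + beta^2 = R^2."""
    return alpha * math.tan(alpha / 2.0) - math.sqrt(RADIUS**2 - alpha**2)


def odd_condition(alpha):
    """-alpha cot(alpha/2) - beta, with beta fixed by the same circle."""
    return -alpha / math.tan(alpha / 2.0) - math.sqrt(RADIUS**2 - alpha**2)


def bound_state_alphas():
    """Brackets follow the branches where tan(alpha/2) > 0 (even) or cot(alpha/2) < 0 (odd)."""
    eps = 1e-9
    even = [
        bisect(even_condition, eps, math.pi - eps),
        bisect(even_condition, 2.0 * math.pi + eps, RADIUS),
    ]
    odd = [bisect(odd_condition, math.pi + eps, 2.0 * math.pi - eps)]
    return even, odd


if __name__ == "__main__":
    even, odd = bound_state_alphas()
    for alpha in even + odd:
        beta_sq = RADIUS**2 - alpha**2
        # E / V0 = -beta^2 / R^2 for each bound state
        print(alpha, -beta_sq / RADIUS**2)
```

For this well the search returns two even roots and one odd root, matching the count read off the graphical construction.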
Transmission Resonances


As we have seen, the spectrum of energies for the square well is made up of a finite number of bound-state levels. The most important feature of a bound-state wave function is that it is localized in space, i.e., it falls off exponentially as we move away from the well.

Another interesting feature of the square well has to do with the continuous part of its energy spectrum. All energy states with $E > 0$ are allowed. Again there are three regions to consider, labeled in the same way as earlier. We will again assume that we have a wave incident from the left (from $x = -\infty$) with unit intensity

$$\psi_I(x) = e^{ikx} + Be^{-ikx} \quad\text{where}\quad E = \frac{\hbar^2k^2}{2m} \tag{8.241}$$

where $B$ is the amplitude for the reflected wave.

In region II, which is over the well, we have the solution

$$\psi_{II}(x) = Ce^{ik_1x} + De^{-ik_1x} \quad\text{where}\quad E + V_0 = \frac{\hbar^2k_1^2}{2m} \tag{8.242}$$

In region III, we only have a transmitted wave as the solution

$$\psi_{III}(x) = Fe^{ikx} \quad\text{where}\quad E = \frac{\hbar^2k^2}{2m} \tag{8.243}$$

Again, we must match the wave functions and their derivatives at x = ±a/2. 
This gives us 4 equations for the 4 unknown coefficients. Usually an author now 
says that lots of algebra gives us the transmission coefficient and writes down 
the answer. Let us actually do it just once. 

Continuity of the wave function and its derivative at $x = -a/2$ gives

$$e^{-ika/2} + Be^{ika/2} = Ce^{-ik_1a/2} + De^{ik_1a/2}$$

$$ike^{-ika/2} - ikBe^{ika/2} = ik_1Ce^{-ik_1a/2} - ik_1De^{ik_1a/2}$$

Continuity of the wave function and its derivative at $x = +a/2$ gives

$$Fe^{ika/2} = Ce^{ik_1a/2} + De^{-ik_1a/2}$$

$$ikFe^{ika/2} = ik_1Ce^{ik_1a/2} - ik_1De^{-ik_1a/2}$$

Then, multiplying the two wave-function conditions by $ik_1$,

$$ik_1e^{-ika/2} + ik_1Be^{ika/2} = ik_1Ce^{-ik_1a/2} + ik_1De^{ik_1a/2}$$

$$ike^{-ika/2} - ikBe^{ika/2} = ik_1Ce^{-ik_1a/2} - ik_1De^{ik_1a/2}$$

$$ik_1Fe^{ika/2} = ik_1Ce^{ik_1a/2} + ik_1De^{-ik_1a/2}$$

$$ikFe^{ika/2} = ik_1Ce^{ik_1a/2} - ik_1De^{-ik_1a/2}$$
Solving for $C$ and $D$ we get

$$2ik_1Ce^{-ik_1a/2} = i(k + k_1)e^{-ika/2} - i(k - k_1)Be^{ika/2}$$

$$2ik_1De^{ik_1a/2} = -i(k - k_1)e^{-ika/2} + i(k + k_1)Be^{ika/2}$$

$$2ik_1Ce^{ik_1a/2} = i(k + k_1)Fe^{ika/2}$$

$$2ik_1De^{-ik_1a/2} = -i(k - k_1)Fe^{ika/2}$$

Rearranging we have

$$C = \left(\frac{k+k_1}{2k_1}\right)Fe^{-ik_1a/2}e^{ika/2} = e^{ik_1a/2}\left[\left(\frac{k+k_1}{2k_1}\right)e^{-ika/2} - \left(\frac{k-k_1}{2k_1}\right)Be^{ika/2}\right]$$

$$D = -\left(\frac{k-k_1}{2k_1}\right)Fe^{ik_1a/2}e^{ika/2} = e^{-ik_1a/2}\left[-\left(\frac{k-k_1}{2k_1}\right)e^{-ika/2} + \left(\frac{k+k_1}{2k_1}\right)Be^{ika/2}\right]$$

Therefore

$$\left(\frac{k+k_1}{2k_1}\right)Fe^{-ik_1a}e^{ika/2} - \left(\frac{k+k_1}{2k_1}\right)e^{-ika/2} = -\left(\frac{k-k_1}{2k_1}\right)Be^{ika/2}$$

$$-\left(\frac{k-k_1}{2k_1}\right)Fe^{ik_1a}e^{ika/2} + \left(\frac{k-k_1}{2k_1}\right)e^{-ika/2} = \left(\frac{k+k_1}{2k_1}\right)Be^{ika/2}$$

Dividing we get

$$\frac{(k+k_1)\left[Fe^{-ik_1a}e^{ika/2} - e^{-ika/2}\right]}{-(k-k_1)\left[Fe^{ik_1a}e^{ika/2} - e^{-ika/2}\right]} = -\frac{k-k_1}{k+k_1}$$

or

$$(k+k_1)^2\left[Fe^{-ik_1a}e^{ika/2} - e^{-ika/2}\right] = -(k-k_1)^2\left[-Fe^{ik_1a}e^{ika/2} + e^{-ika/2}\right]$$

$$\left[(k+k_1)^2e^{-ik_1a} - (k-k_1)^2e^{ik_1a}\right]e^{ika/2}F = \left[(k+k_1)^2 - (k-k_1)^2\right]e^{-ika/2}$$

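Finally we solve for the transmission coefficient $F$ to get

$$F = \frac{\left[(k+k_1)^2 - (k-k_1)^2\right]e^{-ika}}{(k+k_1)^2e^{-ik_1a} - (k-k_1)^2e^{ik_1a}} = \frac{4kk_1e^{-ika}}{(k^2+k_1^2)\left(e^{-ik_1a} - e^{ik_1a}\right) + 2kk_1\left(e^{-ik_1a} + e^{ik_1a}\right)} = \frac{e^{-ika}}{\cos k_1a - \frac{i}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\sin k_1a}$$

and the transmission probability

$$T(E) = |F(E)|^2 = \frac{1}{1 + \dfrac{\sin^2\sqrt{\frac{2m}{\hbar^2}V_0a^2\left(1+\frac{E}{V_0}\right)}}{4\left(\frac{E}{V_0}\right)\left(1+\frac{E}{V_0}\right)}} \tag{8.244}$$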
[Figure 8.9: Transmission over a Square Well (plot of $T(E)$ versus $E$)]


A plot of $T(E)$ versus $E$ is shown above for the case

$$\frac{2mV_0a^2}{\hbar^2} = \frac{25\pi^2}{4} \tag{8.245}$$

The peaks are called resonances. They occur when $T(E) = 1$ or

$$\sin^2\sqrt{\frac{2m}{\hbar^2}V_0a^2\left(1+\frac{E}{V_0}\right)} = 0$$

or when

$$\sqrt{\frac{2m}{\hbar^2}V_0a^2\left(1+\frac{E}{V_0}\right)} = n\pi$$

$$\frac{E}{V_0} = \frac{n^2\pi^2\hbar^2}{2mV_0a^2} - 1 = \frac{4n^2}{25} - 1 \geq 0 \tag{8.246}$$

$$\frac{E}{V_0} = 0.44,\ 1.56,\ 3.00,\ \ldots$$

in agreement with the diagram.
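The resonance condition can be checked numerically from (8.244). The following sketch is an illustration only; it works in the dimensionless variable $E/V_0$ for the case $2mV_0a^2/\hbar^2 = 25\pi^2/4$, and the function names are choices made here.

```python
import math

STRENGTH = 25.0 * math.pi**2 / 4.0  # 2 m V0 a^2 / hbar^2, the case plotted in the text


def well_transmission(e):
    """T(E) from (8.244) as a function of e = E / V0 > 0."""
    phase = math.sqrt(STRENGTH * (1.0 + e))  # this is k1 * a
    return 1.0 / (1.0 + math.sin(phase) ** 2 / (4.0 * e * (1.0 + e)))


def resonance_energies(count):
    """First `count` positive values of E/V0 satisfying k1 * a = n * pi."""
    energies = []
    n = 1
    while len(energies) < count:
        e = n * n * math.pi**2 / STRENGTH - 1.0
        if e > 0.0:
            energies.append(e)
        n += 1
    return energies


if __name__ == "__main__":
    for e in resonance_energies(3):
        print(round(e, 2), well_transmission(e))
    # Between resonances the transmission is below one.
    print(well_transmission(1.0))
```

The first three resonances come from $n = 3, 4, 5$, since smaller $n$ would give $E < 0$.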

To a particle of these particular energies, the potential looks completely transparent; there is no reflected particle

$$R(E) = |B(E)|^2 = 0 \tag{8.247}$$

All of the incident wave is transmitted into region III.

A special feature of these amplitudes, not derivable from non-relativistic quantum mechanics, relates the properties of the transmission probability and the bound state energies of the well.

The transmission amplitude $F(E)$ is a function of the energy $E$. It was derived for $E > 0$. If we assume that it is an analytic function of $E$, then it has some interesting properties. In particular, in the region of $E < 0$ and real, $F(E)$ has poles (goes to infinity) at $E$ values that correspond to the bound state energies of the well.

$F(E)$ being infinite corresponds to having a transmitted wave without having an incident wave. This is exactly the condition to have a bound state, where the transmitted (and reflected) waves do not propagate, but instead fall off exponentially.

The poles of $F(E)$ occur when

$$\cos k_1a = \frac{i}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\sin k_1a$$

$$k_1\cot\frac{k_1a}{2} = ik \quad\text{and}\quad k_1\tan\frac{k_1a}{2} = -ik$$

which (with $k \to i\kappa$ for $E < 0$, where $\kappa$ plays the role of the decay constant $k$ and $\hbar k_1$ the role of $p$ in (8.229) and (8.232)) are the same transcendental equations we had earlier for the bound states of the corresponding finite square well.

This property can be used to find bound state energies after solving the transmission problem for a given potential.
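As a numerical illustration of this property, the sketch below continues the denominator of $F(E)$ to $E < 0$ by setting $k = i\kappa$ and searches for its zeros. Under that substitution the denominator becomes $\cos\alpha + \frac{1}{2}(\beta/\alpha - \alpha/\beta)\sin\alpha$ with $\alpha = k_1a$ and $\beta = \kappa a$, where $\alpha^2 + \beta^2 = 2mV_0a^2/\hbar^2$. The function names and the bracketing intervals are assumptions chosen here for the well with radius $5\pi/2$.

```python
import math

RADIUS = 5.0 * math.pi / 2.0  # sqrt(2 m V0 a^2) / hbar


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    flo = f(lo)
    if flo * f(hi) > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)


def continued_denominator(alpha):
    """Denominator of F(E) with k = i*kappa: alpha = k1*a, beta = kappa*a."""
    beta = math.sqrt(RADIUS**2 - alpha**2)
    return math.cos(alpha) + 0.5 * (beta / alpha - alpha / beta) * math.sin(alpha)


def pole_alphas():
    """Zeros of the continued denominator, one per hand-chosen bracket."""
    brackets = [(2.0, 3.0), (4.5, 5.2), (7.0, 7.5)]
    return [bisect(continued_denominator, lo, hi) for lo, hi in brackets]


if __name__ == "__main__":
    print(pole_alphas())
```

The zeros found this way coincide with the even and odd bound-state values of $\alpha$ obtained from the transcendental equations, including the two even values quoted in (8.240).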

What do wave packets say? 

If we construct wave packets representing the incident, reflected and transmitted 
particles using the formalism we demonstrated earlier, we find the following 
result. 

Away from a resonance energy, the transmitted and reflected packets do not exhibit any strange behavior, i.e., the reflected packet forms and starts to move in the $-x$ direction from $x = -a/2$ at about the same time the transmitted packet forms and starts to move in the $+x$ direction from $x = a/2$. This time is on the order of the classical transit time for a particle moving across the well

$$\Delta t_{classical} = \frac{a}{v} = \frac{ma}{\hbar k_1} \tag{8.248}$$

However, near any resonance, this behavior changes dramatically. 

The wave seems to get bottled up in the vicinity of the well, as if some temporary 
or metastable intermediate state forms. After a time larger than the classical 
transit time across the well, the reflected and transmitted packets reform and 
start moving away. An example is given below. 

Time Delay at a Square Well 

We consider the transmission through a square well as shown below in Figure 8.10.

[Figure 8.10: Finite Square Well. Regions I, II and III, with $V = 0$ outside the well and $V = -V_0$ for $-a/2 < x < a/2$.]
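The transmission amplitude is (from our earlier calculation)

$$F(E) = \frac{e^{-ika}}{\cos k_1a - \frac{i}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\sin k_1a} = |F|e^{i\theta} \tag{8.249}$$

where we have set

$$e^{-ika}\,e^{i\tan^{-1}\left(\frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a\right)} = e^{i\theta} \tag{8.250}$$

We then have

$$\tan(\theta + ka) = \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a \tag{8.251}$$

Using trigonometric identities we get

$$\frac{\tan\theta + \tan ka}{1 - \tan\theta\tan ka} = \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a \tag{8.252}$$

$$\tan\theta + \tan ka = \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a\,(1 - \tan\theta\tan ka) \tag{8.253}$$

and

$$\tan\theta\left(1 + \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a\tan ka\right) = \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a - \tan ka \tag{8.254}$$

$$\tan\theta = \frac{\frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a - \tan ka}{1 + \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a\tan ka} \tag{8.255}$$

Now the incident wave packet is given by

$$\psi_{incident}(x,t) = \int f(k)e^{i(kx - \omega(k)t)}\,dk$$

and the transmitted wave packet is given by

$$\psi_{transmitted}(x,t) = \int f(k)e^{i(kx - \omega(k)t)}\,|F(k)|e^{i\theta(k)}\,dk \tag{8.256}$$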
where $f(k)$ is nonzero only in a limited region about $k = k_0$ and is peaked at $k = k_0$. The stationary phase argument then gives for the incident wave

$$x = v_0t \tag{8.257}$$

and for the transmitted wave

$$x = v_0t - \left.\frac{d\theta}{dk}\right|_{k=k_0} \tag{8.258}$$

Therefore, the incident wave packet arrives at $x = -a/2$ at $t = -a/2v_0$ and the transmitted wave packet leaves $x = +a/2$ at

$$t = \frac{a}{2v_0} + \frac{1}{v_0}\left.\frac{d\theta}{dk}\right|_{k=k_0}, \qquad v_0 = \frac{\hbar k_0}{m} \tag{8.259}$$


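Therefore, we have a quantum mechanical time delay over and above $\Delta t_{classical}$ given by

$$\tau = \frac{1}{v_0}\left.\frac{d\theta}{dk}\right|_{k=k_0} \tag{8.260}$$

Now

$$\frac{d\tan\theta}{dk} = \frac{d\tan\theta}{d\theta}\frac{d\theta}{dk} = \frac{1}{\cos^2\theta}\frac{d\theta}{dk}$$

or

$$\frac{d\theta}{dk} = \cos^2\theta\,\frac{d\tan\theta}{dk} = \frac{1}{1 + \tan^2\theta}\frac{d\tan\theta}{dk} \tag{8.261}$$

where

$$\tan\theta = \frac{\frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a - \tan ka}{1 + \frac{1}{2}\left(\frac{k}{k_1} + \frac{k_1}{k}\right)\tan k_1a\tan ka} \tag{8.262}$$

and

$$E = \frac{\hbar^2k^2}{2m} \quad\text{and}\quad E + V_0 = \frac{\hbar^2k_1^2}{2m}$$

or

$$k_1^2 = k^2 + \frac{2mV_0}{\hbar^2} \tag{8.263}$$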


One can show (Messiah) that these solutions have the following properties:
1. Resonances (when $T = 1$) occur when

$$k_1a = n\pi \;\rightarrow\; \tan k_1a = 0$$


2. The time delay due to quantum mechanics near resonance is approximately


given by

_ Ro 

7” — 2j^1~classical 

(8.264) 

where

$$\tau_{classical} = \frac{ma}{\hbar k} = \text{classical transit time} \tag{8.265}$$

This corresponds physically to the wave (or particle) remaining in region II (the vicinity of the well) during this time delay.
At that time the wave packet splits up and the reflected and transmitted 
packets reform and propagate. 

3. Off resonance, the transmission probability is very small and the wave 
practically does not penetrate the region of the well (it is totally reflected 
at the first potential discontinuity). 
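The stationary-phase timing in (8.257) to (8.259) can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes $a = 1$ and $2mV_0a^2/\hbar^2 = 25\pi^2/4$, differentiates the phase of $F$ by a central finite difference, and compares the resulting transit time $(a + d\theta/dk)/v_0$ with the classical crossing time $ma/\hbar k_1$ of (8.248). The function names are hypothetical, and the closed form $\frac{1}{2}(k/k_1 + k_1/k)$ quoted for the ratio at resonance is our own evaluation of the phase derivative at $\tan k_1a = 0$, not a formula taken from the text.

```python
import cmath
import math

STRENGTH = 25.0 * math.pi**2 / 4.0  # 2 m V0 a^2 / hbar^2 with a = 1


def k1_of(k):
    """Wave number over the well, from k1^2 = k^2 + 2 m V0 / hbar^2."""
    return math.sqrt(k * k + STRENGTH)


def amplitude(k, a=1.0):
    """Transmission amplitude F for the square well."""
    k1 = k1_of(k)
    denom = math.cos(k1 * a) - 0.5j * (k / k1 + k1 / k) * math.sin(k1 * a)
    return cmath.exp(-1j * k * a) / denom


def dtheta_dk(k, h=1e-5):
    """Central difference of the phase of F; the ratio form avoids 2*pi jumps."""
    return cmath.phase(amplitude(k + h) * amplitude(k - h).conjugate()) / (2.0 * h)


def transit_ratio(k, a=1.0):
    """(a + dtheta/dk)/v0 divided by the classical crossing time a/v1 (v is proportional to k)."""
    return (a + dtheta_dk(k)) / a * (k1_of(k) / k)


if __name__ == "__main__":
    k_resonance = math.pi * math.sqrt(2.75)  # k1 * a = 3 * pi
    k_between = math.pi * math.sqrt(6.0)     # k1 * a = 3.5 * pi
    print(transit_ratio(k_resonance), transit_ratio(k_between))
```

In this sketch the packet spends longer than the classical crossing time at the resonance and slightly less than it midway between resonances, in line with the qualitative picture described above.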


8.5.4. Delta-Function Potentials 

We now consider the potential energy function

$$V(x) = A\delta(x - a) \tag{8.266}$$

where

$$\delta(x - a) = 0 \quad x \neq a, \qquad \int_{-\infty}^{\infty}f(x)\delta(x - a)\,dx = f(a) \tag{8.267}$$

and solve the corresponding Schrödinger equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x) \tag{8.268}$$


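As we discussed earlier the wave function $\psi(x)$ is assumed to be continuous for physical reasons relating to the probability interpretation. The derivative of the wave function, however, is not continuous at $x = a$ for this potential. We can see this as follows. Integrating the Schrödinger equation over a small interval around $x = a$, we have

$$-\frac{\hbar^2}{2m}\int_{a-\varepsilon}^{a+\varepsilon}\frac{d^2\psi}{dx^2}\,dx + A\int_{a-\varepsilon}^{a+\varepsilon}\delta(x - a)\psi(x)\,dx = E\int_{a-\varepsilon}^{a+\varepsilon}\psi(x)\,dx \tag{8.269}$$

In the limit $\varepsilon \to 0$, using the continuity of $\psi(x)$, we get

$$-\frac{\hbar^2}{2m}\left[\left.\frac{d\psi}{dx}\right|_{a+\varepsilon} - \left.\frac{d\psi}{dx}\right|_{a-\varepsilon}\right] = E\psi(a)\int_{a-\varepsilon}^{a+\varepsilon}dx - A\psi(a) \tag{8.270}$$

so that, since the remaining integral vanishes as $\varepsilon \to 0$,

$$\text{discontinuity}\left(\frac{d\psi}{dx}\right)_{x=a} = \Delta\left(\frac{d\psi}{dx}\right) = \frac{2mA}{\hbar^2}\psi(a) \tag{8.271}$$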

For simplicity we choose a = 0. We then have two regions to consider 


region I x < 0 
region II x > 0 


and the derivative is discontinuous at x = 0. 
Transmission Problem

We first carry out the calculation of the transmission and reflection probabilities. We assume that $A > 0$ (we have a delta function barrier), $E > 0$ and an incident wave of unit intensity coming in from the left.

In region I we have

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_I}{dx^2} = E\psi_I \;\rightarrow\; \psi_I(x) = e^{ikx} + Be^{-ikx} \quad\text{with}\quad E = \frac{\hbar^2k^2}{2m} > 0 \tag{8.272}$$

We have both an incident and a reflected wave.

In region II we have

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{II}}{dx^2} = E\psi_{II} \;\rightarrow\; \psi_{II}(x) = Ce^{ikx} \quad\text{with}\quad E = \frac{\hbar^2k^2}{2m} > 0 \tag{8.273}$$

There is only a transmitted wave.

The boundary conditions (at $x = 0$) give

$$\psi_I(0) = \psi_{II}(0) \;\rightarrow\; 1 + B = C$$

$$\frac{d\psi_{II}(0)}{dx} - \frac{d\psi_I(0)}{dx} = \frac{2mA}{\hbar^2}\psi_{II}(0) \;\rightarrow\; ikC - ik(1 - B) = \frac{2mA}{\hbar^2}C$$

The solutions are

$$C = \frac{ik}{ik - \frac{mA}{\hbar^2}} \quad\text{and}\quad B = \frac{\frac{mA}{\hbar^2}}{ik - \frac{mA}{\hbar^2}} \tag{8.274}$$

We then have

$$T = \text{transmission probability} = |C|^2 = \frac{1}{1 + \frac{mA^2}{2\hbar^2E}} \tag{8.275}$$

$$R = \text{reflection probability} = |B|^2 = \frac{1}{1 + \frac{2\hbar^2E}{mA^2}} \tag{8.276}$$

We note that $T + R = 1$ as it must for the probability interpretation to make sense.


From our previous discussion, we suspect that the energy values of the poles of the transmission probability correspond to the bound state energies for the delta function well problem ($A < 0$). For the single delta function potential, $T$ has a single pole at

$$E = -\frac{mA^2}{2\hbar^2} \tag{8.277}$$
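The delta-function results can be checked with a short sketch. This is only an illustration: the parameter `strength` stands for the combination $mA/\hbar^2$, units with $\hbar = m = 1$ are the defaults, and the function names are choices made here.

```python
def delta_amplitudes(k, strength):
    """Reflection and transmission amplitudes (B, C) for V(x) = A delta(x).

    `strength` is m*A/hbar^2; k is the incident wave number (k > 0).
    """
    denom = 1j * k - strength
    return strength / denom, 1j * k / denom


def delta_probabilities(k, strength):
    """Return (R, T) = (|B|^2, |C|^2)."""
    B, C = delta_amplitudes(k, strength)
    return abs(B) ** 2, abs(C) ** 2


def pole_energy(strength, m=1.0, hbar=1.0):
    """Energy of the pole of T: E = -hbar^2 strength^2 / (2m) = -m A^2 / (2 hbar^2)."""
    return -(hbar * strength) ** 2 / (2.0 * m)


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0, 4.0):
        R, T = delta_probabilities(k, strength=1.0)
        print(k, R, T, R + T)
    print(pole_energy(1.0))
```

With $k = i\kappa$ and an attractive well the denominator $ik - mA/\hbar^2$ vanishes at $\kappa = m|A|/\hbar^2$, which is where the pole energy comes from.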
Bound-State Problem


We let $A \to -A$, $A > 0$. In region I we have

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_I}{dx^2} = -|E|\psi_I \;\rightarrow\; \psi_I(x) = Be^{\alpha x} \tag{8.278}$$

with

$$E = -|E| = -\frac{\hbar^2\alpha^2}{2m} < 0 \tag{8.279}$$

We have excluded the negative exponential term since it would diverge in region I as $x \to -\infty$.

In region II we have

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{II}}{dx^2} = -|E|\psi_{II} \;\rightarrow\; \psi_{II}(x) = Ce^{-\alpha x} \tag{8.280}$$

with

$$E = -|E| = -\frac{\hbar^2\alpha^2}{2m} < 0 \tag{8.281}$$

We have excluded the positive exponential term since it would diverge in region II as $x \to +\infty$.

The boundary conditions give

$$\psi_I(0) = \psi_{II}(0) \;\rightarrow\; B = C \tag{8.282}$$

$$\frac{d\psi_{II}(0)}{dx} - \frac{d\psi_I(0)}{dx} = -\alpha C - \alpha B = -\frac{2m}{\hbar^2}AB \tag{8.283}$$

The resulting equation for $\alpha$ gives the allowed bound state energies. We have

$$\alpha = \frac{mA}{\hbar^2} \;\rightarrow\; \text{only 1 solution} \;\rightarrow\; \text{only 1 bound state} \tag{8.284}$$

$$E = -|E| = -\frac{\hbar^2\alpha^2}{2m} = -\frac{mA^2}{2\hbar^2} \tag{8.285}$$

which is the same value as we obtained from the pole of the transmission probability.


We also note that the solution has definite parity (even) since ip(x) = ip(-x). 
This must occur since V(x) = V(-x) and hence parity commutes with the 
Hamiltonian. As we also saw in the square well case, if only one solution exists 
then it is always an even parity solution. 
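As a short worked step that is not carried out in the text, the one remaining constant can be fixed by normalization. Using $B = C$ from (8.282) and $\alpha = mA/\hbar^2$ from (8.284), and choosing $B$ real and positive as a convention:

```latex
\int_{-\infty}^{\infty}|\psi(x)|^2\,dx
  = |B|^2\left[\int_{-\infty}^{0}e^{2\alpha x}\,dx + \int_{0}^{\infty}e^{-2\alpha x}\,dx\right]
  = \frac{|B|^2}{\alpha} = 1
\quad\Rightarrow\quad
\psi(x) = \sqrt{\alpha}\,e^{-\alpha|x|}, \qquad \alpha = \frac{mA}{\hbar^2}
```

The dependence on $|x|$ alone makes the even parity of the single bound state explicit.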
Double Delta Function Potential

We now tackle a more realistic and hence more complex system, namely the double delta function potential energy function given by

$$V(x) = u\left[\delta\left(x + \frac{\ell}{2}\right) + \delta\left(x - \frac{\ell}{2}\right)\right] \tag{8.286}$$

We again start by considering the barrier transmission problem for $u > 0$.


Transmission problem

There are three regions to consider

region I: $x < -\frac{\ell}{2}$
region II: $-\frac{\ell}{2} < x < \frac{\ell}{2}$
region III: $x > \frac{\ell}{2}$

The solutions in the three regions are

region I: $x < -\frac{\ell}{2}$, $\psi_I(x) = Ae^{ikx} + Be^{-ikx}$

region II: $-\frac{\ell}{2} < x < \frac{\ell}{2}$, $\psi_{II}(x) = Ce^{ikx} + De^{-ikx}$

region III: $x > \frac{\ell}{2}$, $\psi_{III}(x) = Fe^{ikx}$

where

$$E = \frac{\hbar^2k^2}{2m} > 0 \tag{8.287}$$

The boundary conditions (the derivatives are discontinuous at $x = \pm\ell/2$) give the equations

At $x = -\frac{\ell}{2}$:

$$Ae^{-ik\ell/2} + Be^{ik\ell/2} = Ce^{-ik\ell/2} + De^{ik\ell/2} \tag{8.288}$$

$$ik\left(Ce^{-ik\ell/2} - De^{ik\ell/2}\right) - ik\left(Ae^{-ik\ell/2} - Be^{ik\ell/2}\right) = \frac{2mu}{\hbar^2}\left(Ce^{-ik\ell/2} + De^{ik\ell/2}\right) \tag{8.289}$$

At $x = +\frac{\ell}{2}$:

$$Ce^{ik\ell/2} + De^{-ik\ell/2} = Fe^{ik\ell/2} \tag{8.290}$$

$$ik\left(Fe^{ik\ell/2}\right) - ik\left(Ce^{ik\ell/2} - De^{-ik\ell/2}\right) = \frac{2mu}{\hbar^2}\left(Fe^{ik\ell/2}\right) \tag{8.291}$$

We are interested in calculating the transmission probability

$$T(E) = \frac{|F|^2}{|A|^2} \tag{8.292}$$
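Much algebra gives these results

$$C = \left(1 - \frac{mu}{ik\hbar^2}\right)F \quad\text{and}\quad D = \frac{mu}{ik\hbar^2}e^{ik\ell}F \tag{8.293}$$

$$\frac{F}{A} = \frac{1}{\left[1 - \frac{m^2u^2}{k^2\hbar^4}(1 - \cos 2k\ell)\right] + i\left[\frac{2mu}{k\hbar^2} + \frac{m^2u^2}{k^2\hbar^4}\sin 2k\ell\right]} = \frac{1}{\left[1 - \frac{mu}{ik\hbar^2}\right]^2 - \left[\frac{mu}{ik\hbar^2}\right]^2e^{2ik\ell}} \tag{8.294}$$

The poles of the transmission probability occur for the zeroes of the denominator in (8.294). If we let $u = -|u|$ and $E = -|E| < 0$ and $k = i\gamma$, the poles are given by

$$\left[1 - \frac{m|u|}{\gamma\hbar^2}\right]^2 - \left[\frac{m|u|}{\gamma\hbar^2}\right]^2e^{-2\gamma\ell} = 0 \tag{8.295}$$

Some rearranging and algebra gives

$$e^{-\gamma\ell} = \pm\left(\frac{\gamma\hbar^2}{m|u|} - 1\right) \tag{8.296}$$

as the transcendental equation for the bound state energies of the double delta function well.

We will solve this transcendental equation graphically. Again, it is always best to first clean up the equation as much as possible by changing variables. There are no rules for this part. We let

$$\beta = \frac{\gamma\ell}{2} \quad\text{and}\quad \epsilon = \frac{m|u|\ell}{2\hbar^2} \tag{8.297}$$

The transcendental equation then becomes

$$e^{-2\beta} = \pm\left(\frac{\beta}{\epsilon} - 1\right) \tag{8.298}$$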
Bound-State Problem 

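We now get the same equation in the standard way.

In this case $u < 0$ and we choose

$$E = -|E| = -\frac{\hbar^2\alpha^2}{2m} \tag{8.299}$$

The solutions in the three regions are

region I: $x < -\frac{\ell}{2}$, $\psi_I(x) = Ae^{\alpha x}$

region II: $-\frac{\ell}{2} < x < \frac{\ell}{2}$, $\psi_{II}(x) = Be^{\alpha x} + Ce^{-\alpha x}$

region III: $x > \frac{\ell}{2}$, $\psi_{III}(x) = De^{-\alpha x}$

The boundary conditions (the derivatives are discontinuous at $x = \pm\ell/2$) give the equations

At $x = -\frac{\ell}{2}$:

$$Ae^{-\alpha\ell/2} = Be^{-\alpha\ell/2} + Ce^{\alpha\ell/2} \tag{8.300}$$

$$\alpha\left(Be^{-\alpha\ell/2} - Ce^{\alpha\ell/2}\right) - \alpha\left(Ae^{-\alpha\ell/2}\right) = \frac{2mu}{\hbar^2}\left(Ae^{-\alpha\ell/2}\right) \tag{8.301}$$

At $x = +\frac{\ell}{2}$:

$$Be^{\alpha\ell/2} + Ce^{-\alpha\ell/2} = De^{-\alpha\ell/2} \tag{8.302}$$

$$-\alpha\left(De^{-\alpha\ell/2}\right) - \alpha\left(Be^{\alpha\ell/2} - Ce^{-\alpha\ell/2}\right) = \frac{2mu}{\hbar^2}\left(De^{-\alpha\ell/2}\right) \tag{8.303}$$

We consider two cases:

even parity: $A = D$ and $B = C$
odd parity: $A = -D$ and $B = -C$

Much algebra leads to the transcendental equations

$$\text{even parity} \quad e^{-\alpha\ell} = -1 + \frac{\hbar^2\alpha}{m|u|} \tag{8.304}$$

$$\text{odd parity} \quad e^{-\alpha\ell} = +1 - \frac{\hbar^2\alpha}{m|u|} \tag{8.305}$$

If we let

$$\beta = \frac{\alpha\ell}{2} \quad\text{and}\quad \epsilon = \frac{m|u|\ell}{2\hbar^2} \tag{8.306}$$

the equations become

$$\text{even parity} \quad e^{-2\beta} = +\left(\frac{\beta}{\epsilon} - 1\right) \tag{8.307}$$

$$\text{odd parity} \quad e^{-2\beta} = -\left(\frac{\beta}{\epsilon} - 1\right) \tag{8.308}$$

which agree with the result we obtained from the poles of the transmission amplitude.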
Graphical Solution

Since

$$\epsilon = \frac{m|u|\ell}{2\hbar^2} \tag{8.309}$$

is fixed by the choice of the parameters of the potential, we have transcendental equations for $\beta$. The solutions give us the energy eigenvalues. We can simplify the graphical solution with one more change of variables. We let

$$\eta = \frac{\beta}{\epsilon} \tag{8.310}$$

and get

$$\text{even parity} \quad e^{-2\epsilon\eta} = \eta - 1 \tag{8.311}$$

$$\text{odd parity} \quad e^{-2\epsilon\eta} = 1 - \eta \tag{8.312}$$

Procedure:

1. plot $1 - \eta$ versus $\eta$ and $\eta - 1$ versus $\eta$

2. plot $e^{-2\epsilon\eta}$ versus $\eta$ for several values of $\epsilon$

3. the intersections are the even and odd solutions

[Figure 8.11: Double Delta Function Bound States]

In Figure 8.11 above, we plot the exponential function for several values of $\epsilon$. As can be seen, the properties of the solutions are:

1. Even solutions always exist

2. Odd solutions do not always exist (see the $\epsilon = 0.25$ line, which only intersects the even curve)

3. As $\epsilon \to \infty$, which corresponds to $\ell \to \infty$, we get $E_{odd} = E_{even}$. Physically, the wells are separating so that they no longer influence each other and the system behaves like two separate wells.

We will look at this property again later when we consider a translationally invariant infinite line of delta functions. The interaction between the wells coupled to the translational invariance will lead to energy bands as in real solids.
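The graphical procedure for the double delta function well can also be carried out numerically. The sketch below is an illustration only: it solves $e^{-2\epsilon\eta} = \eta - 1$ (even) and $e^{-2\epsilon\eta} = 1 - \eta$ (odd) by bisection. The function names are hypothetical, and the condition $\epsilon > 1/2$ used for the odd state is our own observation (it follows from comparing the slopes of the two curves at $\eta = 0$), not a statement quoted from the text.

```python
import math


def bisect(f, lo, hi, tol=1e-13):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    flo = f(lo)
    if flo * f(hi) > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)


def even_eta(eps):
    """Solve exp(-2 eps eta) = eta - 1; the root always lies between 1 and 2."""
    return bisect(lambda eta: math.exp(-2.0 * eps * eta) - (eta - 1.0), 1.0, 2.0)


def odd_eta(eps):
    """Solve exp(-2 eps eta) = 1 - eta for eta > 0; returns None when no odd state exists."""
    if eps <= 0.5:
        return None  # the exponential stays above the line 1 - eta for all eta > 0
    f = lambda eta: math.exp(-2.0 * eps * eta) - (1.0 - eta)
    lo = math.log(2.0 * eps) / (2.0 * eps)  # minimum of f, where f is negative
    return bisect(f, lo, 1.0)


if __name__ == "__main__":
    for eps in (0.25, 1.0, 10.0):
        # The energy scales as -(eps * eta)^2, so the larger eta is the deeper level.
        print(eps, even_eta(eps), odd_eta(eps))
```

For large $\epsilon$ both roots approach $\eta = 1$, which is the numerical counterpart of the even and odd levels merging as the wells separate.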


8.6. Harmonic Oscillators 

We now turn our attention to a physical system with a harmonic oscillator potential energy function

$$V(x) = \frac{1}{2}kx^2 = \frac{1}{2}m\omega_0^2x^2 \tag{8.313}$$

We will first solve this system by going to the position representation and solving the Schrödinger equation using differential equation techniques. We will review all of our methods, introduce Hermite polynomials and define generating functions along the way.

After that we will introduce an algebraic method for solving this system that 
does not make any reference to the position representation. In the next chapter 
we will generalize this algebraic method for use in other types of systems. 

8.6.1. Differential Equation Method 

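The Hamiltonian for the system is

$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2}m\omega_0^2\hat{x}^2 \tag{8.314}$$

The energy eigenvalue equation is

$$\hat{H}|\psi\rangle = E|\psi\rangle \tag{8.315}$$

We put this into the position representation by these steps:

$$\langle x|\hat{H}|\psi\rangle = \langle x|E|\psi\rangle = E\langle x|\psi\rangle = E\psi(x)$$

$$\langle x|\left(\frac{\hat{p}^2}{2m} + \frac{1}{2}m\omega_0^2\hat{x}^2\right)|\psi\rangle = E\psi(x)$$

$$\frac{1}{2m}\langle x|\hat{p}^2|\psi\rangle + \frac{1}{2}m\omega_0^2\langle x|\hat{x}^2|\psi\rangle = E\psi(x)$$

$$\frac{1}{2m}\left(-i\hbar\frac{d}{dx}\right)^2\langle x|\psi\rangle + \frac{1}{2}m\omega_0^2x^2\langle x|\psi\rangle = E\psi(x)$$

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}m\omega_0^2x^2\psi(x) = E\psi(x) \tag{8.316}$$

The last line is the standard 1-dimensional time-independent Schrödinger equation for the harmonic oscillator potential.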

When solving such equations it is useful to rewrite them in dimensionless form. We do this by making these substitutions (the choice is an art, not a science)

$$\alpha^4 = \frac{m^2\omega_0^2}{\hbar^2}, \qquad q = \alpha x \qquad\text{and}\qquad \lambda = \frac{2E}{\hbar\omega_0} \tag{8.317}$$

The equation (8.316) becomes

$$\frac{d^2\psi}{dq^2} + (\lambda - q^2)\psi = 0 \tag{8.318}$$

The solution of such equations is greatly aided by examining and extracting the behavior of $\psi$ in the asymptotic regions $q \to \pm\infty$.

For sufficiently large $q$ the equation becomes

$$\frac{d^2\psi}{dq^2} - q^2\psi = 0 \tag{8.319}$$

which is satisfied, to leading order at large $|q|$, by functions of the form

$$q^ne^{\pm\frac{1}{2}q^2} \quad\text{for any finite value of } n \tag{8.320}$$

The probability interpretation of the wave function requires us to be able to normalize it, i.e., we must have

$$\int_{-\infty}^{\infty}|\psi(x)|^2\,dx < \infty \tag{8.321}$$

This rules out the positive exponential. This result suggests that we try a solution of the form

$$\psi(q) = H(q)e^{-\frac{1}{2}q^2} \tag{8.322}$$

where $H(q)$ is a polynomial of finite order in $q$ and we have explicitly extracted the asymptotic behavior.

Substituting into the original differential equation gives a new differential equation for $H(q)$

$$\frac{d^2H}{dq^2} - 2q\frac{dH}{dq} + (\lambda - 1)H = 0 \tag{8.323}$$
aq z dq 
We now use the standard Frobenius series substitution method to solve this equation. We assume a solution of the form

$$H(q) = q^s\sum_{j=0}^{\infty}a_jq^j \quad\text{where}\quad a_0 \neq 0 \text{ and } s \geq 0 \tag{8.324}$$

The nonnegative restriction on $s$ is required in order to guarantee that the solution is well-behaved at $q = 0$ (we need to be able to normalize the wave function). If we substitute this guess into the differential equation for $H$ we get

$$q^s\sum_{j=0}^{\infty}a_j(j+s)(j+s-1)q^{j-2} - 2q^s\sum_{j=0}^{\infty}a_j(j+s)q^j + (\lambda - 1)q^s\sum_{j=0}^{\infty}a_jq^j = 0$$

$$q^s\Big[a_0s(s-1)q^{-2} + a_1s(s+1)q^{-1} + \big(a_2(s+2)(s+1) - a_0(2s+1-\lambda)\big)q^0 + \big(a_3(s+3)(s+2) - a_1(2s+3-\lambda)\big)q^1 + \ldots + \big(a_{j+2}(s+j+2)(s+j+1) - (2s+2j+1-\lambda)a_j\big)q^j + \ldots\Big] = 0$$

Now for a power series to be identically zero, the coefficient of each power must be zero separately. We thus have

$$s(s-1)a_0 = 0 \tag{8.325}$$

$$s(s+1)a_1 = 0 \tag{8.326}$$

$$(s+2)(s+1)a_2 - (2s+1-\lambda)a_0 = 0 \tag{8.327}$$

$$(s+3)(s+2)a_3 - (2s+3-\lambda)a_1 = 0 \tag{8.328}$$

$$\ldots$$

$$(s+j+2)(s+j+1)a_{j+2} - (2s+2j+1-\lambda)a_j = 0 \tag{8.329}$$

Now we assumed that $a_0 \neq 0$, $s \geq 0$. Therefore (8.325) says that we must have $s = 0$ or $s = 1$. (8.326) says that $s = 0$ or $a_1 = 0$, or both. The remaining equations give $a_2$ in terms of $a_0$, $a_3$ in terms of $a_1$, and, in general, $a_{j+2}$ in terms of $a_j$. Whether the series will have a finite or an infinite number of terms depends on the choice of $s$, $a_1$, and the eigenvalue $\lambda$.


If the series does not terminate, then its asymptotic behavior can be inferred from the coefficients of the high power terms. From the last recursion formula (8.329) we have

$$\frac{a_{j+2}}{a_j} \;\xrightarrow[j\to\infty]{}\; \frac{2}{j} \tag{8.330}$$

This ratio is the same as that of the series for $q^ne^{q^2}$ with any finite value of $n$. This implies that the solution we have found for the wave function

$$\psi(q) = H(q)e^{-\frac{1}{2}q^2} \tag{8.331}$$

will diverge at large $q$.
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Since we cannot have that happen and still retain the probability interpretation, 
the series must terminate ( H(q ) must be a polynomial as I mentioned earlier). 
From the last recursion formula (8.329)we can see that if we choose 

$$\lambda = 2s + 2j + 1 \tag{8.332}$$

for some j, then the series will terminate ($a_{j+2} = 0$). Since $a_0 \neq 0$ this cutoff
procedure requires that j equals an even integer. This means that the odd series
will not terminate unless we choose $a_1 = 0$. The index s can still be either 0 or
1. Corresponding to these two values, $\lambda$ is equal to 2j + 1 or 2j + 3, where j is
an even integer. We can express both cases in terms of an integer n as

$$\lambda_n = 2n + 1 \tag{8.333}$$
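
As a hedged numerical check of the termination condition, the following sketch iterates the recursion (8.329) with $\lambda = 2n + 1$ and compares the resulting polynomial with NumPy's physicists' Hermite polynomials. The function name `series_coefficients`, the normalization $a_0 = 1$, and the cutoff `n_terms` are assumptions made for illustration only.

```python
import numpy as np
from numpy.polynomial import hermite as npherm

def series_coefficients(n, n_terms=12):
    """Power-series coefficients c[k] of q**k built from recursion (8.329).

    Assumes lambda = 2n + 1, a_0 = 1, a_1 = 0, and s = 0 (n even) or s = 1 (n odd).
    """
    lam = 2 * n + 1
    s = n % 2
    a = np.zeros(n_terms)
    a[0] = 1.0
    for j in range(0, n_terms - 2, 2):
        a[j + 2] = (2 * s + 2 * j + 1 - lam) * a[j] / ((s + j + 2) * (s + j + 1))
    c = np.zeros(n_terms + s)
    c[s:] = a                      # overall factor q**s shifts every power by s
    return c

for n in range(6):
    c = series_coefficients(n)
    # Rescale so the leading coefficient is 2**n, the usual Hermite convention.
    scaled = c[: n + 1] / c[n] * 2 ** n
    print(n, scaled)
```

The series stops at order n in every case, and after rescaling the coefficients agree with `numpy.polynomial.hermite.herm2poly`.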


This gives us the energy eigenvalues

$$E_n = \hbar\omega_0\left(n + \frac{1}{2}\right) \tag{8.334}$$

As in the square well examples, we have a zero point energy. It corresponds to
the lowest or ground state energy value

$$E_0 = \frac{1}{2}\hbar\omega_0 \tag{8.335}$$


The eigenfunctions have a definite parity, as they should since V(x ) = V(-x). 
From the definition of n, we can see that for a given solution 


n = highest value of s + j (8.336) 

If we denote the corresponding polynomials by $H_n(q)$ we see that $H_n(q)$ is of
degree n in q. Therefore, since the exponential is an even function, we have

n = even → even parity solution
n = odd → odd parity solution

The polynomial of order n, $H_n(q)$, is a solution of the original equation with
$\lambda = 2n + 1$

$$\frac{d^2H_n}{dq^2} - 2q\frac{dH_n}{dq} + 2nH_n = 0 \tag{8.337}$$

It is called the Hermite polynomial. We could find the exact forms of the
polynomials from the recursion relations. However, it is more instructive to introduce
the idea of a generating function. We define

$$S(q,s) = e^{q^2-(s-q)^2} = e^{-s^2+2sq} = \sum_{n=0}^{\infty} G_n(q)\frac{s^n}{n!} \tag{8.338}$$

The functions $G_n(q)$ defined in this way are polynomials in q. Now we have

$$\frac{\partial S}{\partial q} = 2se^{-s^2+2sq} = \sum_{n=0}^{\infty} G_n(q)\frac{2s^{n+1}}{n!} = \sum_{n=0}^{\infty}\frac{dG_n(q)}{dq}\frac{s^n}{n!}$$

$$\frac{\partial S}{\partial s} = (-2s+2q)e^{-s^2+2sq} = \sum_{n=0}^{\infty} G_n(q)\frac{(-2s+2q)s^n}{n!} = \sum_{n=1}^{\infty} G_n(q)\frac{s^{n-1}}{(n-1)!}$$

Equating equal powers of s in each of these equations we get

$$\frac{dG_n}{dq} = 2nG_{n-1} \quad \text{and} \quad G_{n+1} = 2qG_n - 2nG_{n-1} \tag{8.339}$$


The lowest order differential equation, which involves only $G_n(q)$, that we can
construct out of this pair of equations is

$$\frac{d^2G_n}{dq^2} = 2n\frac{dG_{n-1}}{dq}, \qquad \frac{dG_{n+1}}{dq} = 2(n+1)G_n$$

$$\frac{dG_{n+1}}{dq} = 2q\frac{dG_n}{dq} + 2G_n - 2n\frac{dG_{n-1}}{dq}$$

$$\frac{d^2G_n}{dq^2} = 2q\frac{dG_n}{dq} + 2G_n - \frac{dG_{n+1}}{dq} = 2q\frac{dG_n}{dq} + 2G_n - 2(n+1)G_n$$

$$\frac{d^2G_n}{dq^2} - 2q\frac{dG_n}{dq} + 2nG_n = 0 \tag{8.340}$$

which is the Hermite polynomial equation. So $G_n(q) = H_n(q)$ and we have


$$S(q,s) = e^{q^2-(s-q)^2} = e^{-s^2+2sq} = \sum_{n=0}^{\infty} H_n(q)\frac{s^n}{n!} \tag{8.341}$$

We then have

$$\frac{\partial^n S(q,s)}{\partial s^n} = e^{q^2}\frac{\partial^n e^{-(s-q)^2}}{\partial s^n} = (-1)^n e^{q^2}\frac{\partial^n e^{-(s-q)^2}}{\partial q^n} = \sum_{m=n}^{\infty} H_m(q)\frac{s^{m-n}}{(m-n)!} \tag{8.342}$$

where we have used the property

$$\frac{\partial f(s-q)}{\partial s} = -\frac{\partial f(s-q)}{\partial q} \tag{8.343}$$

If we set s = 0 in the equation above we get

$$(-1)^n e^{q^2}\frac{d^n e^{-q^2}}{dq^n} = H_n(q) \tag{8.344}$$

We can then easily generate the polynomials

$$H_0 = 1, \quad H_1 = 2q, \quad H_2 = 4q^2 - 2, \quad \dots \tag{8.345}$$

The Schrodinger wave functions are then given by

$$\psi_n(x) = A_nH_n(\alpha x)e^{-\frac{1}{2}\alpha^2x^2} \tag{8.346}$$

We find the normalization constant $A_n$ by

$$\int_{-\infty}^{\infty}|\psi_n(x)|^2\,dx = 1 = \frac{|A_n|^2}{\alpha}\int_{-\infty}^{\infty} H_n^2(q)e^{-q^2}\,dq \tag{8.347}$$


Now we can determine all of the integrals of this type from the generating 
function as follows. We have 


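$$\int_{-\infty}^{\infty} e^{-s^2+2sq}e^{-t^2+2tq}e^{-q^2}\,dq = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\frac{s^nt^m}{n!\,m!}\int_{-\infty}^{\infty} H_n(q)H_m(q)e^{-q^2}\,dq = \sqrt{\pi}\,e^{2st} = \sqrt{\pi}\sum_{n=0}^{\infty}\frac{(2st)^n}{n!} \tag{8.348}$$

which says that

$$\int_{-\infty}^{\infty} H_n(q)H_m(q)e^{-q^2}\,dq = \sqrt{\pi}\,2^n n!\,\delta_{nm} \tag{8.349}$$

The case n = m gives

$$A_n = \left(\frac{\alpha}{\sqrt{\pi}\,2^n n!}\right)^{1/2} \tag{8.350}$$

The case $n \neq m$ just corresponds to the orthogonality property of eigenfunctions
of different eigenvalues.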
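
The orthogonality relation (8.349) can be checked numerically. The sketch below is only an illustration: it uses Gauss-Hermite quadrature from NumPy, which is exact for polynomial integrands of this weight up to degree 2·`deg` − 1, and the helper name `hermite_overlap` is an assumption rather than anything defined in the text.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def hermite_overlap(n, m, deg=20):
    """Approximate the integral of H_n(q) H_m(q) exp(-q**2) over the real line."""
    q, w = hermgauss(deg)              # nodes and weights for the weight exp(-q**2)
    h_n = hermval(q, [0] * n + [1])    # H_n evaluated at the nodes
    h_m = hermval(q, [0] * m + [1])
    return np.sum(w * h_n * h_m)

for n in range(4):
    row = [hermite_overlap(n, m) for m in range(4)]
    expected = math.sqrt(math.pi) * 2 ** n * math.factorial(n)
    print(n, np.round(row, 6), "diagonal should be", round(expected, 6))
```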


8.6.2. Algebraic Method 

The Hamiltonian for this system is 

$$H = \frac{p^2}{2m} + \frac{1}{2}m\omega_0^2x^2 \tag{8.351}$$

The energy eigenvalue equation is

$$H|\psi\rangle = E|\psi\rangle \tag{8.352}$$

The commutator between the position and momentum operators is

$$[x, p] = i\hbar \tag{8.353}$$

This information defines a complete solution to the problem when using alge¬ 
braic methods. 


We now define the two new operators (like a change of basis) 
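$$a = \sqrt{\frac{m\omega_0}{2\hbar}}\,x + \frac{ip}{\sqrt{2m\hbar\omega_0}} \quad \text{and} \quad a^+ = \sqrt{\frac{m\omega_0}{2\hbar}}\,x - \frac{ip}{\sqrt{2m\hbar\omega_0}} \tag{8.354}$$

or, equivalently,

$$x = \sqrt{\frac{\hbar}{2m\omega_0}}\,(a + a^+) \quad \text{and} \quad p = -i\sqrt{\frac{m\hbar\omega_0}{2}}\,(a - a^+) \tag{8.355}$$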

These new operators are not Hermitian. They are Hermitian conjugates, how¬ 
ever. We then have 


$$[a, a^+] = 1 \quad \text{and} \quad H = \hbar\omega_0\left(N_{op} + \frac{1}{2}\right), \quad N_{op} = a^+a \tag{8.356}$$


Now suppose that the vector $|n\rangle$ is an eigenvector of $N_{op} = a^+a$ with eigenvalue
n

$$N_{op}|n\rangle = n|n\rangle \tag{8.357}$$

Since $N_{op}$ and H differ only by a constant, this implies that $|n\rangle$ is also an
eigenvector of H, i.e.,

$$H|n\rangle = \hbar\omega_0\left(N_{op} + \frac{1}{2}\right)|n\rangle = \hbar\omega_0\left(n + \frac{1}{2}\right)|n\rangle \tag{8.358}$$

Therefore, $|E_n\rangle = |n\rangle$ and the energy eigenvalues are

$$E_n = \hbar\omega_0\left(n + \frac{1}{2}\right) \tag{8.359}$$

Our task now is to find the allowed values of n. 


Now $N_{op}^+ = (a^+a)^+ = a^+a = N_{op}$, so $N_{op}$ is a Hermitian operator. This means that
the eigenvalues of $N_{op}$ (and H) are real numbers. Now we also have the relation

$$n = \langle n|N_{op}|n\rangle = \langle n|a^+a|n\rangle \tag{8.360}$$

If we let $|\phi\rangle = a|n\rangle$, then the last equation (8.360) can be written

$$n = \langle\phi|\phi\rangle \geq 0 \tag{8.361}$$

which says that the eigenvalues of $N_{op}$ (the values of n) are non-negative,
real numbers.


Finally, using $[a, a^+] = 1$, we have

$$N_{op}a = a^+aa = (aa^+ - 1)a = a(a^+a - 1) = a(N_{op} - 1) \tag{8.362}$$

and similarly,

$$N_{op}a^+ = a^+(N_{op} + 1) \tag{8.363}$$

These operator relations imply that

$$N_{op}a|n\rangle = a(N_{op} - 1)|n\rangle = (n-1)a|n\rangle \tag{8.364}$$

which says that $a|n\rangle$ is an eigenstate of $N_{op}$ with eigenvalue n − 1 or that we
can write

$$a|n\rangle = \alpha|n-1\rangle \;\to\; \langle n|a^+ = \alpha^*\langle n-1| \tag{8.365}$$

We will assume that the eigenstates are normalized, i.e., $\langle n|n\rangle = 1$. We then
have


$$(\langle n|a^+)(a|n\rangle) = |\alpha|^2\langle n-1|n-1\rangle = |\alpha|^2 = \langle n|a^+a|n\rangle = \langle n|N_{op}|n\rangle = n \tag{8.366}$$

which says that $\alpha = \sqrt{n}$ and we have the relation

$$a|n\rangle = \sqrt{n}\,|n-1\rangle \tag{8.367}$$

Repeating this procedure we have

$$a^2|n\rangle = a\sqrt{n}\,|n-1\rangle = \sqrt{n}\,a|n-1\rangle = \sqrt{n(n-1)}\,|n-2\rangle \tag{8.368}$$

which implies the following:

if the eigenvalue n exists, then
so do the eigenvalues n − 1, n − 2, ...

This sequence cannot continue forever since we have already shown that the
eigenvalues of $N_{op}$ are non-negative.

The only way out of this dilemma is that n is an integer. In this case, we would
eventually arrive at the eigenstate $|1\rangle$ and then get

$$a|1\rangle = |0\rangle \quad \text{and} \quad a|0\rangle = 0 \tag{8.369}$$

which says the sequence ends. Thus, the eigenvalues of $N_{op}$ are the non-negative
integers.

We can also go the other way since similar algebra shows that

$$N_{op}a^+|n\rangle = a^+(N_{op} + 1)|n\rangle = (n+1)a^+|n\rangle \tag{8.370}$$

This implies that, in the same way as before,

$$a^+|n\rangle = \sqrt{n+1}\,|n+1\rangle \tag{8.371}$$

Thus, the eigenvalues of $N_{op}$ are all the non-negative integers
n = 0, 1, 2, 3, 4, ...

The operators $a$ and $a^+$ are called lowering/raising operators from their property
of lowering and raising the eigenvalues by unity. They are also sometimes called
ladder operators because they generate a ladder of equally-spaced eigenvalues.

Finally, assuming that $\langle 0|0\rangle = 1$ we have

$$|1\rangle = \frac{a^+}{\sqrt{1}}|0\rangle, \quad |2\rangle = \frac{a^+}{\sqrt{2}}|1\rangle = \frac{(a^+)^2}{\sqrt{2!}}|0\rangle, \quad |3\rangle = \frac{a^+}{\sqrt{3}}|2\rangle = \frac{(a^+)^3}{\sqrt{3!}}|0\rangle$$

$$|n\rangle = \frac{(a^+)^n}{\sqrt{n!}}|0\rangle \tag{8.372}$$

and

$$E_n = \hbar\omega_0\left(n + \frac{1}{2}\right) \tag{8.373}$$
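
The ladder algebra can be made concrete with finite matrices. The sketch below is an illustration under an explicit assumption: the infinite basis is truncated to `dim` number states, so the commutator $[a, a^+] = 1$ holds everywhere except in the last diagonal entry, which is a truncation artifact. The names `lowering` and `dim` are hypothetical.

```python
import math
import numpy as np

def lowering(dim):
    """Matrix of a in the truncated basis |0>, ..., |dim-1>, using a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

dim = 8
a = lowering(dim)
a_dag = a.T                          # real matrix, so the Hermitian conjugate is the transpose
N_op = a_dag @ a                     # number operator, diagonal with entries 0, 1, 2, ...
comm = a @ a_dag - a_dag @ a         # should be the identity apart from the last entry

# Build |3> from the ground state as in (8.372).
ground = np.zeros(dim)
ground[0] = 1.0
state3 = np.linalg.matrix_power(a_dag, 3) @ ground / math.sqrt(math.factorial(3))

print(np.diag(N_op))
print(np.diag(comm))
print(state3)
```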

Notice that we were able to derive all of these results without introducing the 
wave function (the position representation). If you are desperate to see the 
position representation wave functions they can easily be derived from this for¬ 
malism. We have 


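$$0 = \langle x|a|0\rangle = \langle x|\left(\sqrt{\frac{m\omega_0}{2\hbar}}\,x + \frac{ip}{\sqrt{2m\hbar\omega_0}}\right)|0\rangle = \sqrt{\frac{m\omega_0}{2\hbar}}\,x\langle x|0\rangle + \frac{\hbar}{\sqrt{2m\hbar\omega_0}}\frac{d}{dx}\langle x|0\rangle \tag{8.374}$$

This is a differential equation for $\langle x|0\rangle = \psi_0(x)$ which is the ground state wave
function. We have

$$\frac{d\psi_0}{dx} + \frac{m\omega_0}{\hbar}x\psi_0 = 0 \tag{8.375}$$

which has the solution (when normalized)

$$\langle x|0\rangle = \psi_0(x) = \left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4}e^{-\frac{m\omega_0}{2\hbar}x^2} \tag{8.376}$$

which agrees with our earlier work.

The higher energy wave functions are obtained using

$$\langle x|1\rangle = \psi_1(x) = \langle x|a^+|0\rangle = \langle x|\left(\sqrt{\frac{m\omega_0}{2\hbar}}\,x - \frac{ip}{\sqrt{2m\hbar\omega_0}}\right)|0\rangle = \sqrt{\frac{m\omega_0}{2\hbar}}\,x\langle x|0\rangle - \sqrt{\frac{\hbar}{2m\omega_0}}\frac{d}{dx}\langle x|0\rangle \tag{8.377}$$

and so on.

The raising and lowering operator formalism is very powerful and will be very
useful beyond non-relativistic quantum mechanics, where these operators will allow the
theory to represent the creation and annihilation of particles.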

Coherent States 

Let us now use this formalism to do an interesting exercise and show the power 
of the methods. 

We consider the following question: what are the eigenstates of the lowering
operator $a$? That is, we write

$$a|\alpha\rangle = \alpha|\alpha\rangle \quad \text{with} \quad \alpha = |\alpha|e^{i\phi} \tag{8.378}$$

where $|\alpha\rangle$ is the eigenvector of $a$ and $\alpha$ is the eigenvalue, which is not necessarily
real since $a$ is not Hermitian.

Since the vectors $|n\rangle$ are eigenvectors of a Hermitian operator, they form an
orthonormal complete set and can be used as an orthonormal basis for the
vector space. We can then write

$$|\alpha\rangle = \sum_{m=0}^{\infty} b_m|m\rangle \tag{8.379}$$

where

$$\langle k|\alpha\rangle = \sum_{m=0}^{\infty} b_m\langle k|m\rangle = \sum_{m=0}^{\infty} b_m\delta_{km} = b_k \tag{8.380}$$

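Now

$$\langle n-1|a|\alpha\rangle = \alpha\langle n-1|\alpha\rangle = \alpha b_{n-1} \tag{8.381}$$

and using

$$a^+|n-1\rangle = \sqrt{n}\,|n\rangle \;\to\; \langle n-1|a = \sqrt{n}\,\langle n| \tag{8.382}$$

we have

$$\langle n-1|a|\alpha\rangle = \sqrt{n}\,\langle n|\alpha\rangle = \sqrt{n}\,b_n \tag{8.383}$$

or

$$b_n = \frac{\alpha}{\sqrt{n}}\,b_{n-1} \tag{8.384}$$

This says that

$$b_1 = \frac{\alpha}{\sqrt{1}}\,b_0, \quad b_2 = \frac{\alpha}{\sqrt{2}}\,b_1 = \frac{\alpha^2}{\sqrt{2!}}\,b_0 \tag{8.385}$$

or

$$b_n = \frac{\alpha^n}{\sqrt{n!}}\,b_0 \tag{8.386}$$

We thus get the final result

$$|\alpha\rangle = b_0\sum_{m=0}^{\infty}\frac{\alpha^m}{\sqrt{m!}}\,|m\rangle \tag{8.387}$$

Let us now normalize this state (choose $b_0$). We have

$$\langle\alpha|\alpha\rangle = 1 = |b_0|^2\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\frac{\alpha^{*m}\alpha^k}{\sqrt{m!}\sqrt{k!}}\langle m|k\rangle = |b_0|^2\sum_{m=0}^{\infty}\sum_{k=0}^{\infty}\frac{\alpha^{*m}\alpha^k}{\sqrt{m!}\sqrt{k!}}\delta_{km} = |b_0|^2\sum_{m=0}^{\infty}\frac{|\alpha|^{2m}}{m!} = |b_0|^2e^{|\alpha|^2} \tag{8.388}$$

which says that

$$b_0 = e^{-\frac{1}{2}|\alpha|^2} \tag{8.389}$$

and thus

$$|\alpha\rangle = e^{-\frac{1}{2}|\alpha|^2}\sum_{m=0}^{\infty}\frac{\alpha^m}{\sqrt{m!}}\,|m\rangle \tag{8.390}$$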

Now 




$\langle n|\alpha\rangle$ = probability amplitude that the system in the state
$|\alpha\rangle$ will be found in the state $|n\rangle$

which then says that

$$P_n = |\langle n|\alpha\rangle|^2 = \frac{e^{-|\alpha|^2}|\alpha|^{2n}}{n!} = \frac{e^{-N}N^n}{n!} \tag{8.391}$$

is the probability that the system in the state $|\alpha\rangle$ will be found in the state $|n\rangle$,
where we have defined $N = |\alpha|^2$.

We note that

$$\langle\alpha|a^+a|\alpha\rangle = |\alpha|^2\langle\alpha|\alpha\rangle = |\alpha|^2 = N = \langle\alpha|N_{op}|\alpha\rangle \tag{8.392}$$

or N = the average value or expectation value of the $N_{op}$ operator in the state
$|\alpha\rangle$. This type of probability distribution is called a Poisson distribution, i.e., the
state $|\alpha\rangle$ has the number states or energy eigenstates distributed in a Poisson
manner.
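
A short numerical sketch makes these statements tangible. It builds the coefficients from the recursion (8.384) with $b_0$ from (8.389), then checks the normalization, the eigenvalue equation, and the Poisson mean. The truncation `dim`, the sample value of `alpha`, and the function name `coherent_coefficients` are assumptions chosen for illustration.

```python
import numpy as np

def coherent_coefficients(alpha, dim):
    """Coefficients b_n of |alpha> in the number basis, truncated to dim states."""
    b = np.zeros(dim, dtype=complex)
    b[0] = np.exp(-abs(alpha) ** 2 / 2)          # normalization, (8.389)
    for n in range(1, dim):
        b[n] = alpha / np.sqrt(n) * b[n - 1]     # recursion, (8.384)
    return b

alpha = 1.5 * np.exp(0.3j)
dim = 60
b = coherent_coefficients(alpha, dim)

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)     # truncated lowering operator
P = np.abs(b) ** 2                               # Poisson probabilities, (8.391)
n = np.arange(dim)
mean_N = np.sum(n * P)
var_N = np.sum(n ** 2 * P) - mean_N ** 2

print("norm      :", P.sum())
print("mean N    :", mean_N, "expected", abs(alpha) ** 2)
print("variance  :", var_N, "(equals the mean for a Poisson distribution)")
print("eigen test:", np.max(np.abs(a @ b - alpha * b)))
```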

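Since the states $|n\rangle$ are energy eigenstates, we know their time dependence, i.e.,

$$|n,t\rangle = e^{-iE_nt/\hbar}|n\rangle \tag{8.393}$$

Therefore, we have for the time dependence of the state $|\alpha\rangle$

$$|\alpha,t\rangle = e^{-\frac{1}{2}|\alpha|^2}\sum_{m=0}^{\infty}\frac{\alpha^m}{\sqrt{m!}}\,|m,t\rangle = e^{-\frac{1}{2}|\alpha|^2}\sum_{m=0}^{\infty}\frac{\alpha^m}{\sqrt{m!}}\,e^{-iE_mt/\hbar}|m\rangle \tag{8.394}$$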



This simple operation clearly indicates the fundamental importance of the en¬ 
ergy eigenstates when used as a basis set. 

If we are able to expand an arbitrary vector representing some physical system 
in the energy basis, then we immediately know the time dependence of that 
state vector and hence we know the time dependence of all the probabilities 
associated with the state vector and the system. 

We will use this result to solve for the time dependence of real physical systems 
in later discussions. 

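Now let us try to understand the physics contained in the $|\alpha\rangle$ state vector. In a
given energy eigenstate the expectation value of the position operator is given
by

$$\langle n,t|x|n,t\rangle = \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n,t|(a+a^+)|n,t\rangle = \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n|e^{iE_nt/\hbar}(a+a^+)e^{-iE_nt/\hbar}|n\rangle$$

$$= \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n|(a+a^+)|n\rangle = \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n|\left(\sqrt{n}\,|n-1\rangle + \sqrt{n+1}\,|n+1\rangle\right) = 0$$

i.e., it is equal to zero and is independent of time.
On the other hand, in the state $|\alpha\rangle$ we find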


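$$\langle\alpha,t|x|\alpha,t\rangle = \sqrt{\frac{\hbar}{2m\omega_0}}\sum_m\sum_k b_m^*b_k\,e^{i(E_m-E_k)t/\hbar}\langle m|(a+a^+)|k\rangle \tag{8.395}$$

Now

$$\langle m|(a+a^+)|k\rangle = \langle m|\left(\sqrt{k}\,|k-1\rangle + \sqrt{k+1}\,|k+1\rangle\right) = \sqrt{k}\,\delta_{m,k-1} + \sqrt{k+1}\,\delta_{m,k+1} \tag{8.396}$$

Using this result we have

$$\langle\alpha,t|x|\alpha,t\rangle = \sqrt{\frac{\hbar}{2m\omega_0}}\left(\sum_{k=1}^{\infty} b_{k-1}^*b_k\sqrt{k}\,e^{i(E_{k-1}-E_k)t/\hbar} + \sum_{k=0}^{\infty} b_{k+1}^*b_k\sqrt{k+1}\,e^{i(E_{k+1}-E_k)t/\hbar}\right)$$

$$= \sqrt{\frac{\hbar}{2m\omega_0}}\left(\sum_{k=0}^{\infty} b_k^*b_{k+1}\sqrt{k+1}\,e^{-i\omega_0t} + \sum_{k=0}^{\infty} b_{k+1}^*b_k\sqrt{k+1}\,e^{i\omega_0t}\right)$$

$$= \sqrt{\frac{\hbar}{2m\omega_0}}\,b_0^2\left(\sum_{k=0}^{\infty}\frac{\alpha^{*k}\alpha^{k+1}}{\sqrt{(k+1)!\,k!}}\sqrt{k+1}\,e^{-i\omega_0t} + \sum_{k=0}^{\infty}\frac{\alpha^{*k+1}\alpha^{k}}{\sqrt{(k+1)!\,k!}}\sqrt{k+1}\,e^{i\omega_0t}\right)$$

$$= \sqrt{\frac{\hbar}{2m\omega_0}}\,b_0^2\sum_{k=0}^{\infty}\frac{|\alpha|^{2k}}{k!}\left(\alpha e^{-i\omega_0t} + \alpha^*e^{i\omega_0t}\right) \tag{8.397}$$

Now using $\alpha = |\alpha|e^{i\phi}$ we get

$$\langle\alpha,t|x|\alpha,t\rangle = \sqrt{\frac{\hbar}{2m\omega_0}}\,b_0^2\,2|\alpha|\sum_{k=0}^{\infty}\frac{|\alpha|^{2k}}{k!}\,\mathrm{Re}\left(e^{i\phi}e^{-i\omega_0t}\right) = 2x_0|\alpha|\cos(\omega_0t-\phi)\left(b_0^2\sum_{k=0}^{\infty}\frac{|\alpha|^{2k}}{k!}\right)$$

$$= 2x_0|\alpha|\cos(\omega_0t-\phi), \qquad x_0 = \sqrt{\frac{\hbar}{2m\omega_0}} \tag{8.398}$$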


The expectation value of the position operator in the state |a) is time-dependent 
and behaves like a classical oscillator. 
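
The classical-looking oscillation in (8.398) can be reproduced numerically. The sketch below is illustrative only and assumes units with $\hbar = m = \omega_0 = 1$ (so $x_0 = 1/\sqrt{2}$ and $E_n = n + 1/2$) and a truncated basis of `dim` number states; the function name `x_expectation` is hypothetical.

```python
import numpy as np

def x_expectation(alpha, t, dim=60):
    """<alpha,t| x |alpha,t> in a truncated number basis, with hbar = m = omega_0 = 1."""
    n = np.arange(dim)
    b = np.zeros(dim, dtype=complex)
    b[0] = np.exp(-abs(alpha) ** 2 / 2)
    for k in range(1, dim):
        b[k] = alpha / np.sqrt(k) * b[k - 1]
    psi_t = b * np.exp(-1j * (n + 0.5) * t)          # time dependence, (8.394)
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    x0 = np.sqrt(0.5)
    x_op = x0 * (a + a.T)                            # x = x0 (a + a^+), (8.355)
    return np.real(np.vdot(psi_t, x_op @ psi_t))

alpha = 1.2 * np.exp(0.7j)
for t in (0.0, 0.5, 1.0, 2.0):
    classical = 2 * np.sqrt(0.5) * abs(alpha) * np.cos(t - np.angle(alpha))
    print(t, x_expectation(alpha, t), classical)
```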

Now let us look at these states in another way. We consider adding a term to
the Hamiltonian of the form

$$V(x) = -F_0x \tag{8.399}$$

so that

$$H = \hbar\omega_0\left(a^+a + \frac{1}{2}\right) - F_0x_0(a + a^+) \tag{8.400}$$


This corresponds to putting a charged oscillator in an electric field and including 
the effective dipole interaction energy in the Hamiltonian. 

In the corresponding classical mechanics problem, we have 

$$H = \frac{p^2}{2m} + \frac{1}{2}kx^2 \;\to\; \text{oscillator with equilibrium point at } x = 0$$

and if we add a linear term in x, we get

$$H = \frac{p^2}{2m} + \frac{1}{2}kx^2 - F_0x = \frac{p^2}{2m} + \frac{1}{2}k\left(x - \frac{F_0}{k}\right)^2 - \frac{F_0^2}{2k}$$

$$\to\; \text{oscillator with equilibrium point at } x = \frac{F_0}{k}$$

This suggests that we look for an oscillator solution to the quantum mechanical 
problem. In particular, we look for a new set of raising and lowering operators 
with the same commutator as the original set. 


We can accomplish this by letting 

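$$A = a + \beta \quad \text{and} \quad A^+ = a^+ + \beta \quad \text{where } \beta = \text{a real number} \tag{8.401}$$

The commutator $[a, a^+] = 1$ then implies that $[A, A^+] = 1$. The Hamiltonian
becomes

$$H = \hbar\omega_0\left((A^+ - \beta)(A - \beta) + \frac{1}{2}\right) - F_0x_0\left((A - \beta) + (A^+ - \beta)\right)$$

$$= \hbar\omega_0\left(A^+A + \frac{1}{2}\right) - \hbar\omega_0\beta(A + A^+) + \hbar\omega_0\beta^2 - F_0x_0(A + A^+) + 2F_0x_0\beta$$

We are still free to choose $\beta$. We choose $\hbar\omega_0\beta = -F_0x_0$, which gives

$$H = \hbar\omega_0\left(A^+A + \frac{1}{2}\right) - \frac{F_0^2x_0^2}{\hbar\omega_0} \tag{8.402}$$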


which is just a harmonic oscillator with the energies shifted by a constant. 


Now choose the new number states $|N\rangle$ where $A^+A|N\rangle = N|N\rangle$ as before. We
then have

$$H|N\rangle = \left(\hbar\omega_0\left(A^+A + \frac{1}{2}\right) - \frac{F_0^2x_0^2}{\hbar\omega_0}\right)|N\rangle = \left(\hbar\omega_0\left(N + \frac{1}{2}\right) - \frac{F_0^2x_0^2}{\hbar\omega_0}\right)|N\rangle = E_N|N\rangle \tag{8.403}$$

or we get the same set of energy levels as before, except that they are all
displaced downward by the same constant amount

$$\frac{F_0^2x_0^2}{\hbar\omega_0}$$

The new ground state is given by

$$A|N=0\rangle = (a + \beta)|N=0\rangle = 0 \tag{8.404}$$

$$a|N=0\rangle = -\beta|N=0\rangle = \frac{F_0x_0}{\hbar\omega_0}|N=0\rangle \tag{8.405}$$

The new ground state is just an eigenstate of $a$ with eigenvalue

$$\alpha = \frac{F_0x_0}{\hbar\omega_0}$$

i.e., the new ground state is a coherent state. These states are called coherent
states because of their properties when they appear in the quantum mechanical
theory of the laser. We will discuss these states in many other contexts later.
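
As a hedged cross-check, the Hamiltonian (8.400) can be diagonalized directly in a truncated number basis. The sketch below assumes units with $\hbar\omega_0 = 1$ and introduces the dimensionless coupling `g` $= F_0x_0/\hbar\omega_0$ (a name chosen here for convenience). The low-lying eigenvalues should be $n + 1/2 - g^2$ and the ground state should have Poisson-distributed number probabilities with $N = g^2$.

```python
from math import factorial
import numpy as np

g = 1.0          # assumed value of F_0 x_0 / (hbar omega_0)
dim = 80         # basis truncation; large enough that low states are converged

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
H = np.diag(np.arange(dim) + 0.5) - g * (a + a.T)    # (8.400) in units of hbar omega_0

energies, vectors = np.linalg.eigh(H)
ground = vectors[:, 0]                               # new ground state |N = 0>

P_numeric = ground[:10] ** 2
P_poisson = np.array([np.exp(-g ** 2) * g ** (2 * n) / factorial(n) for n in range(10)])

print("lowest energies:", energies[:4])
print("expected       :", np.arange(4) + 0.5 - g ** 2)
print("max |P - Poisson|:", np.max(np.abs(P_numeric - P_poisson)))
```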

Now let us get this state another way that illustrates some new powerful tools. 


Using the Translation Operator 

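In general, a displaced state $|\lambda\rangle$ is given in terms of the displacement operator
(in one dimension) by

$$|\lambda\rangle = e^{-\frac{i}{\hbar}p\lambda}|0\rangle \tag{8.406}$$

For the harmonic oscillator system

$$p = -i\sqrt{\frac{m\hbar\omega_0}{2}}\,(a - a^+) \tag{8.407}$$

If we choose $|0\rangle$ to be the ground state of the oscillator, then we have for the
corresponding displaced ground-state

$$|\lambda\rangle = e^{\sqrt{\frac{m\omega_0}{2\hbar}}(a^+ - a)\lambda}|0\rangle \tag{8.408}$$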


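By Glauber's theorem (see last section of Chapter 8 for a derivation)

$$e^{(A+B)} = e^Ae^Be^{-\frac{1}{2}[A,B]} \tag{8.409}$$

we have

$$e^{\sqrt{\frac{m\omega_0}{2\hbar}}(a^+ - a)\lambda} = e^{\sqrt{\frac{m\omega_0}{2\hbar}}a^+\lambda}\,e^{-\sqrt{\frac{m\omega_0}{2\hbar}}a\lambda}\,e^{\frac{1}{2}\frac{m\omega_0}{2\hbar}[a^+,a]\lambda^2} = e^{\sqrt{\frac{m\omega_0}{2\hbar}}a^+\lambda}\,e^{-\sqrt{\frac{m\omega_0}{2\hbar}}a\lambda}\,e^{-\frac{m\omega_0}{4\hbar}\lambda^2} \tag{8.410}$$

and thus

$$|\lambda\rangle = e^{\sqrt{\frac{m\omega_0}{2\hbar}}a^+\lambda}\,e^{-\sqrt{\frac{m\omega_0}{2\hbar}}a\lambda}\,e^{-\frac{m\omega_0}{4\hbar}\lambda^2}|0\rangle \tag{8.411}$$

Now

$$e^{-\sqrt{\frac{m\omega_0}{2\hbar}}a\lambda}|0\rangle = \left(1 + \left(-\sqrt{\frac{m\omega_0}{2\hbar}}\lambda\right)a + \frac{1}{2}\left(-\sqrt{\frac{m\omega_0}{2\hbar}}\lambda\right)^2a^2 + \dots\right)|0\rangle = |0\rangle \tag{8.412}$$

where we used $a|0\rangle = 0$. Similarly, using $(a^+)^n|0\rangle = \sqrt{n!}\,|n\rangle$ we have

$$e^{\sqrt{\frac{m\omega_0}{2\hbar}}a^+\lambda}|0\rangle = \left(1 + \left(\sqrt{\frac{m\omega_0}{2\hbar}}\lambda\right)a^+ + \frac{1}{2}\left(\sqrt{\frac{m\omega_0}{2\hbar}}\lambda\right)^2(a^+)^2 + \dots\right)|0\rangle = \sum_{n=0}^{\infty}\frac{\left(\sqrt{\frac{m\omega_0}{2\hbar}}\lambda\right)^n}{\sqrt{n!}}\,|n\rangle \tag{8.413}$$

or

$$|\lambda\rangle = e^{-\frac{m\omega_0}{4\hbar}\lambda^2}\sum_{n=0}^{\infty}\frac{\left(\sqrt{\frac{m\omega_0}{2\hbar}}\lambda\right)^n}{\sqrt{n!}}\,|n\rangle \tag{8.414}$$

Thus,

$$|\lambda\rangle = \sum_{n=0}^{\infty} b_n|n\rangle \tag{8.415}$$

where

$$b_n = \frac{e^{-N/2}N^{n/2}}{\sqrt{n!}}, \qquad \frac{N}{2} = \frac{m\omega_0}{4\hbar}\lambda^2 \tag{8.416}$$

or

$$P_n = \text{probability of finding the system in the state } |n\rangle = |b_n|^2 = \frac{e^{-N}N^n}{n!} \tag{8.417}$$

which is a Poisson distribution. We, thus, obtain the coherent states once again.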
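
The disentangled result (8.414) can be compared against a direct exponentiation of the displacement operator. The sketch below is illustrative: it works in a truncated number basis, writes `kappa` for the dimensionless combination $\sqrt{m\omega_0/2\hbar}\,\lambda$ (a name introduced here), and exponentiates the anti-Hermitian matrix through the eigendecomposition of the Hermitian matrix $iK$ rather than through any particular matrix-exponential routine.

```python
from math import factorial
import numpy as np

kappa = 0.8      # assumed value of sqrt(m omega_0 / 2 hbar) * lambda
dim = 60         # basis truncation

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
K = kappa * (a.T - a)                      # exponent of (8.408); K is anti-Hermitian

# 1j*K is Hermitian, so K = -1j * V diag(w) V^+ and exp(K) = V diag(exp(-1j w)) V^+.
w, V = np.linalg.eigh(1j * K)
vacuum = np.zeros(dim)
vacuum[0] = 1.0
displaced = V @ (np.exp(-1j * w) * (V.conj().T @ vacuum))

n = np.arange(12)
expected = np.exp(-kappa ** 2 / 2) * kappa ** n / np.sqrt([factorial(k) for k in n])

print(np.round(displaced[:6].real, 8))
print(np.round(expected[:6], 8))
```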


8.7. Green’s Functions 

In this section we will give a simple introduction to the Green’s function tech¬ 
nique for solving differential equations. We will return to this subject later on in 
more detail. The general form of the differential equations we have been solving 
is 

$$L\psi(x) = Q(x) \quad \text{where} \quad L = E + \frac{\hbar^2}{2m}\frac{d^2}{dx^2} \quad \text{and} \quad Q(x) = V(x)\psi(x) \tag{8.418}$$

Let us define a new function G(x, x') such that

$$LG(x,x') = \delta(x - x') \tag{8.419}$$

The solution to the original differential equation is then given by

$$\psi(x) = \psi_0(x) + \int_{-\infty}^{\infty} G(x,x')Q(x')\,dx' \tag{8.420}$$

where $\psi_0(x)$ is any function that satisfies the homogeneous equation

$$L\psi_0(x) = 0 \tag{8.421}$$

Proof:

$$L\psi(x) = L\psi_0(x) + L\int_{-\infty}^{\infty} G(x,x')Q(x')\,dx' = 0 + \int_{-\infty}^{\infty}\left(LG(x,x')\right)Q(x')\,dx' = \int_{-\infty}^{\infty}\delta(x-x')Q(x')\,dx' = Q(x)$$


One representation of the delta function is given by 


oo 

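$$\delta(x-x') = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{\frac{i}{\hbar}p(x-x')}\,dp \tag{8.422}$$

and if we assume that the Fourier transform of G(x, x') exists (label it $\mathcal{G}$), then
we can write

$$G(x,x') = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{\frac{i}{\hbar}p(x-x')}\,\mathcal{G}(p)\,dp \tag{8.423}$$

Substituting into the differential equation for G(x, x') we get

$$LG(x,x') = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}\left(Le^{\frac{i}{\hbar}p(x-x')}\right)\mathcal{G}(p)\,dp = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}\left(\left(E + \frac{\hbar^2}{2m}\frac{d^2}{dx^2}\right)e^{\frac{i}{\hbar}p(x-x')}\right)\mathcal{G}(p)\,dp$$

$$= \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}\left(E - \frac{p^2}{2m}\right)e^{\frac{i}{\hbar}p(x-x')}\,\mathcal{G}(p)\,dp = \delta(x-x') = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{\frac{i}{\hbar}p(x-x')}\,dp \tag{8.424}$$

This says that

$$\mathcal{G}(p) = \frac{1}{E - \frac{p^2}{2m}} \tag{8.425}$$

and hence

$$\psi(x) = e^{\frac{i}{\hbar}px} + \int_{-\infty}^{\infty}\left[\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}\frac{e^{\frac{i}{\hbar}p(x-x')}}{E - \frac{p^2}{2m}}\,dp\right]Q(x')\,dx' \tag{8.426}$$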


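where we have chosen

$$\psi_0(x) = e^{\frac{i}{\hbar}px} \tag{8.427}$$

as the homogeneous solution for E > 0 (in this term p is fixed at $\sqrt{2mE}$; it is not
the integration variable).

We can rearrange this as the expression

$$\psi(x) = e^{\frac{i}{\hbar}px} + \int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}px}}{E - \frac{p^2}{2m}}\int_{-\infty}^{\infty} e^{-\frac{i}{\hbar}px'}Q(x')\,dx' \tag{8.428}$$

Example #1 - Single Delta Function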


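Let us choose $Q(x) = u\delta(x)\psi(x)$, which is a delta function barrier at the origin.
We then have

$$\psi(x) = e^{\frac{i}{\hbar}px} + \int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}px}}{E - \frac{p^2}{2m}}\int_{-\infty}^{\infty} e^{-\frac{i}{\hbar}px'}u\delta(x')\psi(x')\,dx' = e^{\frac{i}{\hbar}px} + u\psi(0)\int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}px}}{E - \frac{p^2}{2m}} \tag{8.429}$$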


How do we evaluate this integral? 

There seems to be a mathematical problem since the integrand has poles when

$$p = \pm\sqrt{2mE} \tag{8.430}$$

But being physicists, we know that this problem has a sensible finite solution 
(we already solved it earlier). 


The key here is that we seemingly have a solution to the original differential 
equation with no unknown constants. However, we have not imposed any bound¬ 
ary conditions yet. 

This is a solution to a second-order differential equation and any complete solution
always requires two boundary conditions.


In this case, the boundary conditions are replaced by a choice of the path of 
integration or a choice of how we decide to avoid the two poles. 


We must treat the integration variable p as a complex variable with valid physics 
represented along the real axis. Once we accept this idea, we can then complete 
the definition of the integral by choosing a contour of integration in the complex 
p-plane. 


This choice is equivalent to setting boundary conditions, as we shall see.

This is a very important lesson to learn.

We must always remember that we are physicists dealing with real physical 
systems. The equations describing these real systems must have solutions since 
the systems exist. When we run into a mathematical dead-end, the way out of 
the dilemma is usually to invoke physical arguments. This case of the choice of 
contours is just the first example of many that we shall see. The creation of the 
delta function by Dirac was another such case. If this is to make sense, then the 
possible contour choices should be able to reproduce all of the relevant physics 
that we obtained earlier using boundary conditions. 

Let us see how. The poles are on the real axis as shown in Figure 8.12 below. 



There are eight possible contour choices which are shown in Figure 8.13 below.

Figure 8.13: Possible Contours


We are looking for the contour that corresponds to the boundary conditions 
where we end up with only outgoing waves, i.e., where the integral behaves 
like a wave traveling to the right (the transmitted wave) for x > 0 and a wave 
traveling to the left (the reflected wave) for x < 0. 

This is accomplished by choosing contour #3 for x > 0 and contour #4 for 
x < 0. In each case, the contour encloses only one pole and using the method of 
residues we can easily evaluate the integral. 


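We get

$$\int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}px}}{E - \frac{p^2}{2m}} = \frac{m}{i\hbar}\left[\frac{2e^{\frac{i}{\hbar}px}}{\sqrt{2mE} + p}\right]_{p=\sqrt{2mE}} = \frac{m}{i\hbar\sqrt{2mE}}\,e^{\frac{i}{\hbar}\sqrt{2mE}\,x}, \quad x > 0$$

$$\int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}px}}{E - \frac{p^2}{2m}} = \frac{m}{i\hbar}\left[\frac{2e^{\frac{i}{\hbar}px}}{\sqrt{2mE} - p}\right]_{p=-\sqrt{2mE}} = \frac{m}{i\hbar\sqrt{2mE}}\,e^{-\frac{i}{\hbar}\sqrt{2mE}\,x}, \quad x < 0$$

The complete solution to the problem is then (using the absolute value function)

$$\psi(x) = \psi_{inc}(x) + \frac{m|u|}{ip\hbar}\,e^{\frac{i}{\hbar}p|x|}\psi(0) \quad \text{where } p = \sqrt{2mE} \tag{8.431}$$

Let x = 0 in this result and we can solve for $\psi(0)$:

$$\psi(0) = \frac{ip\hbar}{ip\hbar - m|u|} \;\to\; \psi(x) = e^{\frac{i}{\hbar}px} + \frac{m|u|}{ip\hbar - m|u|}\,e^{\frac{i}{\hbar}p|x|} \tag{8.432}$$

We then have

$$\psi_{trans}(x) = \frac{ip\hbar}{ip\hbar - m|u|}\,e^{\frac{i}{\hbar}px} \tag{8.433}$$

and

$$S(E) = \text{transmission amplitude} = \frac{ip\hbar}{ip\hbar - m|u|} \tag{8.434}$$

and

$$T(E) = \text{transmission probability} = |S(E)|^2 = \frac{E}{E + \frac{mu^2}{2\hbar^2}} \tag{8.435}$$

which is the same result as we found earlier.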
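
A few lines of code can confirm that these amplitudes hang together. The sketch below is illustrative, with assumed sample values of E and u and units $m = \hbar = 1$ by default; the reflection amplitude `R` is read off as the coefficient of $e^{ip|x|/\hbar}$ in (8.432) for x < 0, and the function name is hypothetical.

```python
import numpy as np

def delta_barrier_amplitudes(E, u, m=1.0, hbar=1.0):
    """Transmission and reflection amplitudes for a delta barrier of strength u."""
    p = np.sqrt(2 * m * E)
    denom = 1j * p * hbar - m * abs(u)
    S = 1j * p * hbar / denom          # transmission amplitude, (8.434)
    R = m * abs(u) / denom             # coefficient of the outgoing wave for x < 0
    return S, R

E, u = 0.7, 1.3
S, R = delta_barrier_amplitudes(E, u)
T = abs(S) ** 2

print("T              :", T)
print("closed form    :", E / (E + u ** 2 / 2))     # (8.435) with m = hbar = 1
print("T + |R|^2      :", T + abs(R) ** 2)
```

Probability conservation, $T + |R|^2 = 1$, is not imposed anywhere in the construction, so it serves as an independent check on the contour choice.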


Example #2 - Double Delta Function 

Let us choose $Q(x) = u(\delta(x+a) + \delta(x-a))\psi(x)$. We then have

$$\psi(x) = e^{\frac{i}{\hbar}px} + \int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}px}}{E - \frac{p^2}{2m}}\int_{-\infty}^{\infty} e^{-\frac{i}{\hbar}px'}u\left(\delta(x'+a) + \delta(x'-a)\right)\psi(x')\,dx'$$

$$= e^{\frac{i}{\hbar}px} + u\psi(a)\int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}p(x-a)}}{E - \frac{p^2}{2m}} + u\psi(-a)\int_{-\infty}^{\infty}\frac{dp}{2\pi\hbar}\frac{e^{\frac{i}{\hbar}p(x+a)}}{E - \frac{p^2}{2m}}$$

Both of these integrals are evaluated using contour #3 for x > a (we get outgoing
waves moving towards the right only from both terms).

We then get

$$\psi(x) = e^{\frac{i}{\hbar}px} + \frac{m|u|}{ip\hbar}\left[\psi(a)e^{\frac{i}{\hbar}p(x-a)} + \psi(-a)e^{\frac{i}{\hbar}p(x+a)}\right] \quad \text{where } p = \sqrt{2mE} \tag{8.436}$$

If we find $\psi(a)$ and $\psi(-a)$ by substituting ±a, we can then evaluate the
transmission probability as in the single delta function example. We again get the
same result as before.

More about Green’s functions later.

8.8. Charged Particles in Electromagnetic Fields

If a charged particle moves in an electromagnetic field, the Lorentz force 

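$$\vec{F} = q\left(\vec{E} + \frac{\vec{v}}{c}\times\vec{B}\right) \tag{8.437}$$

acts on the particle, where q = charge, $\vec{E}$ = electric field, $\vec{v}$ = velocity and $\vec{B}$ =
magnetic field.

The electric and magnetic fields can be expressed in terms of the electromagnetic
vector and scalar potentials

$$\vec{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\vec{A}}{\partial t} \quad \text{and} \quad \vec{B} = \nabla\times\vec{A} \tag{8.438}$$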

In order to make the transition to a quantum mechanical discussion, we must 
determine the Lagrangian and Hamiltonian for the charged particle in an elec¬ 
tromagnetic field. 

The appearance of the velocity vector v in the Lorentz force law means that we 
will need to work with a velocity-dependent potential function. If we insert the 
expressions for the electric and magnetic fields into the Lorentz force law we get 

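$$\vec{F} = q\left(-\nabla\phi - \frac{1}{c}\frac{\partial\vec{A}}{\partial t} + \frac{1}{c}\vec{v}\times(\nabla\times\vec{A})\right) \tag{8.439}$$

Using the vector identity

$$\vec{B}\times(\nabla\times\vec{C}) = \nabla(\vec{B}\cdot\vec{C}) - (\vec{B}\cdot\nabla)\vec{C} - (\vec{C}\cdot\nabla)\vec{B} - \vec{C}\times(\nabla\times\vec{B}) \tag{8.440}$$

we get

$$\vec{v}\times(\nabla\times\vec{A}) = \nabla(\vec{v}\cdot\vec{A}) - (\vec{v}\cdot\nabla)\vec{A} \tag{8.441}$$

where we have used the fact that $\vec{v}$ is not an explicit function of the position
vector $\vec{r}$.

Now, the total derivative relationship is given by

$$\frac{d\vec{A}}{dt} = \frac{\partial\vec{A}}{\partial t} + \frac{\partial\vec{A}}{\partial x}\frac{dx}{dt} + \frac{\partial\vec{A}}{\partial y}\frac{dy}{dt} + \frac{\partial\vec{A}}{\partial z}\frac{dz}{dt} = \frac{\partial\vec{A}}{\partial t} + (\vec{v}\cdot\nabla)\vec{A} \tag{8.442}$$

Putting all this together we get

$$\vec{F} = q\left(-\nabla\phi + \frac{1}{c}\nabla(\vec{v}\cdot\vec{A}) - \frac{1}{c}\frac{d\vec{A}}{dt}\right) \tag{8.443}$$

The standard Lagrange equation

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}_j}\right) - \frac{\partial L}{\partial x_j} = 0 \quad \text{where } L = T - V = \text{Lagrangian} \tag{8.444}$$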


works for a potential V which does not depend on the velocity. However, when 
the force law takes the following form 


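$$F_j = -\frac{\partial U}{\partial x_j} + \frac{d}{dt}\left(\frac{\partial U}{\partial\dot{x}_j}\right) \tag{8.445}$$

we can still construct a Lagrange equation of the form

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}_j}\right) - \frac{\partial L}{\partial x_j} = 0 \quad \text{where } L = T - U = \text{Lagrangian} \tag{8.446}$$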


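We can get the Lorentz force law into this form by using a mathematical trick

$$\frac{d\vec{A}}{dt} = \frac{d}{dt}\left(\frac{\partial}{\partial\vec{v}}(\vec{v}\cdot\vec{A})\right) = \frac{d}{dt}\left(\frac{\partial}{\partial\vec{v}}(\vec{v}\cdot\vec{A} - c\phi)\right) \tag{8.447}$$

where we have used the fact that the potential $\phi$ is independent of the velocity,
to obtain

$$F_x = -\frac{\partial}{\partial x}\left(q\phi - \frac{q}{c}\vec{v}\cdot\vec{A}\right) + \frac{d}{dt}\left(\frac{\partial}{\partial v_x}\left(q\phi - \frac{q}{c}\vec{v}\cdot\vec{A}\right)\right) \tag{8.448}$$

This says that we need to choose the generalized potential U as

$$U = q\phi - \frac{q}{c}\vec{v}\cdot\vec{A} \tag{8.449}$$

and get the Lagrangian

$$L = T - U = \frac{1}{2}m\vec{v}^{\,2} - q\phi + \frac{q}{c}\vec{v}\cdot\vec{A} \tag{8.450}$$

This gives the canonical momentum

$$\vec{p} = \frac{\partial L}{\partial\vec{v}} = m\vec{v} + \frac{q}{c}\vec{A} \tag{8.451}$$

The canonical momentum is NOT simply the mechanical momentum but
now includes an extra term proportional to the vector potential (the momentum
of the field!).

The Hamiltonian is derived from the Lagrangian using

$$H = \vec{p}\cdot\vec{v} - L = \frac{1}{2m}\left(\vec{p} - \frac{q}{c}\vec{A}\right)^2 + q\phi \tag{8.452}$$

This says that the simplest way of coupling the electromagnetic field to the
motion of the particle is to replace the momentum by the quantity

$$\vec{p} - \frac{q}{c}\vec{A} \tag{8.453}$$

which includes the momentum associated with the electromagnetic field.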

This method of introducing the electromagnetic field into the equations of mo¬ 
tion of a particle is called minimal coupling. We will see its relationship to gauge 
invariance shortly. 
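
The Legendre transform behind the minimal-coupling Hamiltonian is easy to verify numerically at a single phase-space point. The sketch below is illustrative: the values of m, q, c, the scalar potential and the random vectors are arbitrary assumptions (Gaussian units, as in the text), and the variable names are chosen here.

```python
import numpy as np

rng = np.random.default_rng(0)
m, q, c = 1.3, -0.7, 2.0          # arbitrary sample values
v = rng.normal(size=3)            # velocity
A = rng.normal(size=3)            # vector potential at the particle position
phi = 0.4                         # scalar potential at the particle position

L = 0.5 * m * (v @ v) - q * phi + (q / c) * (v @ A)    # Lagrangian, L = T - U
p = m * v + (q / c) * A                                # canonical momentum, p = dL/dv

H_legendre = p @ v - L                                 # H = p.v - L
kinetic = p - (q / c) * A                              # kinetic momentum m v
H_minimal = (kinetic @ kinetic) / (2 * m) + q * phi    # (p - qA/c)^2 / 2m + q phi

print(H_legendre, H_minimal)
```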


The transition to quantum mechanics is now done in the standard manner. 



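$$H|\psi(t)\rangle = E|\psi(t)\rangle = i\hbar\frac{\partial}{\partial t}|\psi(t)\rangle \tag{8.454}$$

$$\langle\vec{r}|H|\psi(t)\rangle = E\langle\vec{r}|\psi(t)\rangle = i\hbar\frac{\partial}{\partial t}\langle\vec{r}|\psi(t)\rangle \tag{8.455}$$

$$\langle\vec{r}|H|\psi(t)\rangle = E\psi(\vec{r},t) = i\hbar\frac{\partial\psi(\vec{r},t)}{\partial t} \tag{8.456}$$

where

$$H = \frac{1}{2m}\left(\vec{p} - \frac{q}{c}\vec{A}(\vec{r},t)\right)^2 + q\phi(\vec{r},t) \tag{8.457}$$

Using

$$\vec{p} = \frac{\hbar}{i}\nabla \tag{8.458}$$

we then get the Schrodinger equation

$$\langle\vec{r}|\left[\frac{1}{2m}\left(\vec{p} - \frac{q}{c}\vec{A}(\vec{r},t)\right)^2 + q\phi(\vec{r},t)\right]|\psi(t)\rangle = E\psi(\vec{r},t) = i\hbar\frac{\partial\psi(\vec{r},t)}{\partial t} \tag{8.459}$$

$$\left[\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A}(\vec{r},t)\right)^2 + q\phi(\vec{r},t)\right]\langle\vec{r}|\psi(t)\rangle = E\psi(\vec{r},t) = i\hbar\frac{\partial\psi(\vec{r},t)}{\partial t} \tag{8.460}$$

$$\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A}(\vec{r},t)\right)^2\psi(\vec{r},t) + q\phi(\vec{r},t)\psi(\vec{r},t) = E\psi(\vec{r},t) = i\hbar\frac{\partial\psi(\vec{r},t)}{\partial t} \tag{8.461}$$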

The Schrodinger equation for a charged particle in an electromagnetic field 
must be invariant under a gauge transformation (since Maxwell’s equations are 
so invariant), which is represented by the relations 


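$$\vec{A} \to \vec{A}' = \vec{A} + \nabla S(\vec{r},t) \tag{8.462}$$

$$\phi \to \phi' = \phi - \frac{1}{c}\frac{\partial S(\vec{r},t)}{\partial t} \tag{8.463}$$

where $S(\vec{r},t)$ is an arbitrary function. In order for the Schrodinger equation to
be invariant it must be transformed to

$$\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A}'(\vec{r},t)\right)^2\psi'(\vec{r},t) + q\phi'(\vec{r},t)\psi'(\vec{r},t) = i\hbar\frac{\partial\psi'(\vec{r},t)}{\partial t} \tag{8.464}$$

where the wave function is only changed by an overall phase factor, i.e.,

$$\psi'(\vec{r},t) = \psi(\vec{r},t)\,e^{\frac{iq}{\hbar c}S(\vec{r},t)} \tag{8.465}$$

Proof:

$$\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A}'(\vec{r},t)\right)^2\psi'(\vec{r},t) + q\phi'(\vec{r},t)\psi'(\vec{r},t) = i\hbar\frac{\partial\psi'(\vec{r},t)}{\partial t} \tag{8.466}$$

$$\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A} - \frac{q}{c}\nabla S\right)^2\psi e^{\frac{iq}{\hbar c}S} + \left(q\phi - \frac{q}{c}\frac{\partial S}{\partial t}\right)\psi e^{\frac{iq}{\hbar c}S} = i\hbar\frac{\partial\left(\psi e^{\frac{iq}{\hbar c}S}\right)}{\partial t} \tag{8.467}$$

Using the identities

$$\frac{\partial\left(\psi e^{\frac{iq}{\hbar c}S}\right)}{\partial t} = \frac{\partial\psi}{\partial t}e^{\frac{iq}{\hbar c}S} + \frac{iq}{\hbar c}\frac{\partial S}{\partial t}\psi e^{\frac{iq}{\hbar c}S} \tag{8.468}$$

$$\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A} - \frac{q}{c}\nabla S\right)\left(e^{\frac{iq}{\hbar c}S}\psi\right) = e^{\frac{iq}{\hbar c}S}\left(\frac{\hbar}{i}\nabla - \frac{q}{c}\vec{A}\right)\psi \tag{8.469}$$

we get the original Schrodinger equation back.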
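
The key identity (8.469) can be spot-checked numerically in one dimension. The sketch below is only an illustration: the wave function, vector potential and gauge function are arbitrary smooth test functions chosen here, the units $\hbar = q = c = 1$ are assumed, and derivatives are taken by finite differences with `numpy.gradient`, so agreement is expected only up to discretization error.

```python
import numpy as np

hbar, q, c = 1.0, 1.0, 1.0                      # assumed units
x = np.linspace(-5.0, 5.0, 20001)

psi = np.exp(-x ** 2) * np.exp(2j * x)          # arbitrary test wave function
A = np.cos(2 * x)                               # arbitrary vector potential (1D)
S = np.sin(x)                                   # arbitrary gauge function
phase = np.exp(1j * q * S / (hbar * c))         # phase factor of (8.465)

def kinetic_momentum(f, A_field):
    """Apply ((hbar/i) d/dx - (q/c) A) to f using finite differences."""
    return (hbar / 1j) * np.gradient(f, x) - (q / c) * A_field * f

A_prime = A + np.gradient(S, x)                 # gauge-transformed potential, (8.462)
lhs = kinetic_momentum(phase * psi, A_prime)    # left side of (8.469)
rhs = phase * kinetic_momentum(psi, A)          # right side of (8.469)

err = np.max(np.abs(lhs - rhs)[5:-5])           # ignore one-sided stencils at the ends
print("max deviation:", err)
```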


This result shows that the solutions of the gauge-transformed Schrodinger equa¬ 
tion still describe the same physical states. 

The wave functions or state vectors differ by a phase factor that depends on 
space and time and thus, the invariance is LOCAL rather than GLOBAL (a 
phase factor independent of space and time). 

It is then clear that it is NOT the canonical momentum $\vec{p} = -i\hbar\nabla$ (whose
expectation value is NOT gauge invariant), but the genuine kinetic momentum

$$\vec{p} - \frac{q}{c}\vec{A}(\vec{r},t) \tag{8.470}$$

(whose expectation value IS gauge invariant), that represents a measurable
quantity.

In any physical system, if the momentum operator $\vec{p}$ appears, then it must
always be replaced by

$$\vec{p} - \frac{q}{c}\vec{A}(\vec{r},t)$$

if we turn on an electromagnetic field. This is the only way to guarantee gauge 
invariance in quantum mechanics. 


Quantum mechanics + electromagnetism requires this minimal coupling for
gauge invariance to be valid. It is important to remember that local gauge
transformations of the electromagnetic potentials require that wave functions
or state vectors transform via local phase changes, which means a phase factor
that depends on space and time.

Feynman showed that the effect of turning on a magnetic field is to multiply
the wave functions or probability amplitudes by the phase factor

$$\exp\left[\frac{iq}{\hbar c}\int_{path} d\vec{\ell}\cdot\vec{A}(\vec{r},t)\right] \tag{8.471}$$


The Aharonov-Bohm Effect 

The extra phase factor has a rather striking and unexpected consequence. 

Let A be independent of t and consider the possible interference between the 
motions along two different paths as shown in Figure 8.14 below. 



Figure 8.14: 2-Path Experiment 


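When a magnetic field is turned on, the relative phase for the two paths (1 and
2) is changed by a factor

$$\exp\left[\frac{iq}{\hbar c}\left(\int_{path\,1} d\vec{\ell}\cdot\vec{A} - \int_{path\,2} d\vec{\ell}\cdot\vec{A}\right)\right] = \exp\left[\frac{iq}{\hbar c}\Phi\right] \tag{8.472}$$

where $\Phi$ = the total magnetic flux passing through the loop formed by the two
paths or the flux enclosed by the loop (this follows from Stokes' theorem applied
to $\vec{B} = \nabla\times\vec{A}$, in direct analogy with Ampere's law).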


Now consider the experimental setup shown in Figure 8.15 below. 


The cylindrical solenoid between the two slits generates a magnetic field out 
of the page and confined to the region of the solenoid. This implies that the 
particles are traveling in regions free of any magnetic fields. 


The enclosed flux is NOT zero, however. As the flux is increased, the relative 
phase between the two paths changes as described above and the diffraction 
pattern on the screen changes. This occurs even though there are no magnetic 
fields present along the paths. 


There does exist, however, a nonzero vector potential in these regions and the
relative phase is changing because the vector potential is changing.

Figure 8.15: Aharonov-Bohm Experiment

This illustrates most powerfully that it is the electromagnetic potentials, rather
than the electromagnetic fields, as one might assume from Maxwell's equations,
that are the fundamental physical quantities in quantum mechanics. This
is called the Aharonov-Bohm effect.
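
The statement that the enclosed flux matters even where the field vanishes can be illustrated with a line integral. The sketch below assumes an idealized, infinitely thin solenoid on the z-axis carrying flux `flux`, for which one common gauge choice outside the solenoid is $\vec{A} = \frac{\Phi}{2\pi r^2}(-y, x)$ with $\nabla\times\vec{A} = 0$ there. The elliptical loops, the sample flux value and the function names are assumptions made for illustration.

```python
import numpy as np

def solenoid_A(x, y, flux):
    """Vector potential outside an ideal thin solenoid on the z-axis (B = 0 there)."""
    r2 = x ** 2 + y ** 2
    return -flux * y / (2 * np.pi * r2), flux * x / (2 * np.pi * r2)

def loop_integral(cx, cy, rx, ry, flux, n=2000):
    """Closed line integral of A . dl around an ellipse centred at (cx, cy)."""
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    x, y = cx + rx * np.cos(t), cy + ry * np.sin(t)
    dx, dy = -rx * np.sin(t), ry * np.cos(t)          # derivatives with respect to t
    Ax, Ay = solenoid_A(x, y, flux)
    return np.sum(Ax * dx + Ay * dy) * (2 * np.pi / n)

flux = 1.7
enclosing = loop_integral(0.3, 0.0, 2.0, 1.0, flux)   # loop around the solenoid
outside = loop_integral(5.0, 0.0, 2.0, 1.0, flux)     # loop that misses the solenoid

hbar, q, c = 1.0, 1.0, 1.0                            # assumed units
print("enclosing loop:", enclosing, "-> relative phase", q * enclosing / (hbar * c))
print("outside loop  :", outside)
```

Only the loop that encloses the solenoid picks up a nonzero integral, equal to the flux, which is exactly the quantity that enters the relative phase (8.472).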

Classically, the electric and magnetic fields, $\vec{E}$ and $\vec{B}$, are the physically relevant
quantities, since they determine the Lorentz force. In regions where the electric
and magnetic fields are zero, a charged particle feels no force. The vector and
scalar potentials $\vec{A}$ and $\phi$ serve in classical physics only as auxiliary quantities.

In quantum mechanics, the vector potential $\vec{A}(\vec{r})$ is the fundamental physical
field. The wave function, however, always has the property that physical
variables and physical effects only depend on gauge-invariant quantities.

More about this later. 


8.9. Time Evolution of Probabilities 

Now that we have solved for the eigenvectors and eigenvalues of several systems 
we will return to a discussion of the time evolution of probability amplitudes. 
Along the way we will reinforce the details about the most fundamental proce¬ 
dure in quantum mechanics. 







Suppose we have some physical system that is describable by quantum mechanics.
We would like to be able to answer the following fundamental question:

If the system is in the state $|\alpha\rangle$ at time $t = 0$,
what is the probability that it will be in the
state $|\beta\rangle$ at time $t$?

These two states are usually defined as follows:

$|\alpha\rangle$ = initial state = state prepared by experimenter
$|\beta\rangle$ = final state = state defined by measuring device

The formal answer to the question is 

$$P_{\beta\alpha}(t) = |\text{Amplitude}_{\beta\alpha}(t)|^2 = |A_{\beta\alpha}(t)|^2 = |\langle\beta|\hat{U}(t)|\alpha\rangle|^2 \tag{8.473}$$

where $\hat{U}(t)$ is the time evolution operator. Let us assume for simplicity at this
stage that the Hamiltonian $\hat{H}$ does not explicitly depend on time $t$. From
our earlier discussions, we then have

$$\hat{U}(t) = e^{-i\hat{H}t/\hbar} \tag{8.474}$$

so that the probability amplitude becomes

$$A_{\beta\alpha}(t) = \langle\beta|\hat{U}(t)|\alpha\rangle = \langle\beta|e^{-\frac{i}{\hbar}\hat{H}t}|\alpha\rangle \tag{8.475}$$

The most important rule in quantum mechanics is the following: 

If you are evaluating an amplitude involving a
particular operator, $\hat{H}$ in this case, then switch
to a basis for the space corresponding to the
eigenvectors of that operator.

This basis is called the HOME space. 

This is carried out by inserting identity operators as sums of projection operators 
written in the energy eigenvector basis: 

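$$A_{\beta\alpha}(t) = \langle\beta|e^{-\frac{i}{\hbar}\hat{H}t}|\alpha\rangle = \langle\beta|\hat{I}e^{-\frac{i}{\hbar}\hat{H}t}\hat{I}|\alpha\rangle
= \sum_{E}\sum_{E'}\langle\beta|E\rangle\langle E|e^{-\frac{i}{\hbar}\hat{H}t}|E'\rangle\langle E'|\alpha\rangle \tag{8.476}$$

where

$$\hat{H}|E\rangle = E|E\rangle \quad\text{and}\quad \hat{H}|E'\rangle = E'|E'\rangle \tag{8.477}$$

$$\langle E|E'\rangle = \delta_{EE'} \tag{8.478}$$

$$\sum_{E}|E\rangle\langle E| = \hat{I} = \sum_{E'}|E'\rangle\langle E'| \tag{8.479}$$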



It is clear that the time evolution calculation is most easily done in this basis. 


We then have 


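$$A_{\beta\alpha}(t) = \sum_{E}\sum_{E'}\langle\beta|E\rangle\langle E|e^{-\frac{i}{\hbar}E't}|E'\rangle\langle E'|\alpha\rangle$$

$$= \sum_{E}\sum_{E'}e^{-\frac{i}{\hbar}E't}\langle\beta|E\rangle\langle E|E'\rangle\langle E'|\alpha\rangle$$

$$= \sum_{E}\sum_{E'}e^{-\frac{i}{\hbar}E't}\langle\beta|E\rangle\delta_{EE'}\langle E'|\alpha\rangle$$

$$= \sum_{E}e^{-\frac{i}{\hbar}Et}\langle\beta|E\rangle\langle E|\alpha\rangle$$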

Let us now describe a typical procedure: 

1. Determine the initial state of the system = $|\alpha\rangle$

2. Expand the initial state in terms of the energy basis

$$|\alpha\rangle = \sum_{E''}|E''\rangle\langle E''|\alpha\rangle$$

which means we know the coefficients $\langle E''|\alpha\rangle$

3. Expand the final state in terms of the energy basis

$$|\beta\rangle = \sum_{E''}|E''\rangle\langle E''|\beta\rangle$$

which means we know the coefficients $\langle E''|\beta\rangle$

4. Calculate the amplitude and probability. 
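
The four steps above can be sketched numerically. The following is only an illustrative sketch in Python (not part of the original text): the function names, the choice of units, and the two-level test data are assumptions made for the example. It evaluates the sum over energy eigenstates of the phase factor times the two sets of expansion coefficients.

```python
import cmath

def amplitude(energies, beta_coeffs, alpha_coeffs, t, hbar=1.0):
    """A(t) = sum_E exp(-i E t / hbar) <beta|E> <E|alpha>.

    beta_coeffs[k]  holds <E_k|beta>  (so <beta|E_k> is its complex conjugate),
    alpha_coeffs[k] holds <E_k|alpha>.
    """
    total = 0j
    for E, b, a in zip(energies, beta_coeffs, alpha_coeffs):
        total += cmath.exp(-1j * E * t / hbar) * complex(b).conjugate() * a
    return total

def probability(energies, beta_coeffs, alpha_coeffs, t, hbar=1.0):
    """P(t) = |A(t)|^2."""
    return abs(amplitude(energies, beta_coeffs, alpha_coeffs, t, hbar)) ** 2

# Hypothetical two-level illustration (values chosen only for the example):
# |alpha> = (|E_1> + |E_2>)/sqrt(2), |beta> = (|E_1> - |E_2>)/sqrt(2)
energies = [1.0, 3.0]
alpha = [2 ** -0.5, 2 ** -0.5]
beta = [2 ** -0.5, -(2 ** -0.5)]

p_start = probability(energies, beta, alpha, 0.0)           # orthogonal states at t = 0
p_half = probability(energies, beta, alpha, cmath.pi / 2)   # relative phase has advanced by pi
print(p_start, p_half)
```

In this made-up example the two states are orthogonal at $t = 0$, and the relative phase $(E_2 - E_1)t/\hbar$ rotates the initial state into the final one at $t = \pi/2$.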

Example - Infinite Square Well System 


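Suppose that a particle in an infinite square well $[-a/2,+a/2]$ is initially in the
state $|\alpha\rangle$, such that

$$\langle x|\alpha\rangle = \begin{cases} 0 & x < 0 \\ \dfrac{2}{\sqrt{a}} & 0 < x < \dfrac{a}{4} \\ 0 & x > \dfrac{a}{4} \end{cases} \tag{8.480}$$

Now we have found that

$$\langle x|E_n\rangle = \begin{cases} \sqrt{\dfrac{2}{a}}\cos\dfrac{n\pi x}{a} & n \text{ odd} \\ \sqrt{\dfrac{2}{a}}\sin\dfrac{n\pi x}{a} & n \text{ even} \end{cases} \tag{8.481}$$

where

$$E_n = \frac{n^2\pi^2\hbar^2}{2ma^2} \tag{8.482}$$

$$n = 1,2,3,4,\ldots \tag{8.483}$$

Therefore, we get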


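$$\langle E_n|\alpha\rangle = \int\langle E_n|x\rangle\langle x|\alpha\rangle\,dx = \frac{2}{\sqrt{a}}\int_0^{a/4}\langle E_n|x\rangle\,dx = \frac{2\sqrt{2}}{n\pi}\begin{cases}\sin\dfrac{n\pi}{4} & n \text{ odd} \\ 1-\cos\dfrac{n\pi}{4} & n \text{ even}\end{cases} \tag{8.484}$$

$$|\alpha\rangle = \langle E_1|\alpha\rangle|E_1\rangle + \langle E_2|\alpha\rangle|E_2\rangle + \langle E_3|\alpha\rangle|E_3\rangle + \ldots$$

$$= \frac{2}{\pi}|E_1\rangle + \frac{\sqrt{2}}{\pi}|E_2\rangle + \frac{2}{3\pi}|E_3\rangle + \frac{\sqrt{2}}{\pi}|E_4\rangle - \frac{2}{5\pi}|E_5\rangle + \frac{\sqrt{2}}{3\pi}|E_6\rangle - \frac{2}{7\pi}|E_7\rangle + \ldots \tag{8.485}$$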

If we assume that the final state $|\beta\rangle$ is an energy eigenstate, say $|E_n\rangle$, then
the probability amplitude that the particle that started out in state $|\alpha\rangle$ at $t = 0$
is found in the state $|E_n\rangle$ at time $t$ is

$$A_{n\alpha}(t) = \sum_{E}e^{-\frac{i}{\hbar}Et}\langle E_n|E\rangle\langle E|\alpha\rangle = e^{-\frac{i}{\hbar}E_nt}\langle E_n|\alpha\rangle \tag{8.486}$$

and the corresponding probabilities are

$$P_{n\alpha}(t) = |\langle E_n|\alpha\rangle|^2 \tag{8.487}$$

which is independent of time. A plot of these fixed probabilities versus $n$ is
shown in Figure 8.16 below.



Figure 8.16: Probabilities

If, on the other hand, we assume that the final state is a linear combination of
two energy eigenstates, say,

$$|\beta\rangle = \frac{1}{\sqrt{2}}|E_1\rangle + \frac{1}{\sqrt{2}}|E_3\rangle \tag{8.488}$$

then the probability amplitude that the particle that started out in state $|\alpha\rangle$
at $t = 0$ is found in the state $|\beta\rangle$ at time $t$ is


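$$A_{\beta\alpha}(t) = \sum_{E}e^{-\frac{i}{\hbar}Et}\langle\beta|E\rangle\langle E|\alpha\rangle
= \frac{1}{\sqrt{2}}\left(e^{-\frac{i}{\hbar}E_1t}\langle E_1|\alpha\rangle + e^{-\frac{i}{\hbar}E_3t}\langle E_3|\alpha\rangle\right) \tag{8.489}$$

and the corresponding probability is

$$P_{\beta\alpha}(t) = |A_{\beta\alpha}(t)|^2 = \frac{1}{2}\left|e^{-\frac{i}{\hbar}E_1t}\langle E_1|\alpha\rangle + e^{-\frac{i}{\hbar}E_3t}\langle E_3|\alpha\rangle\right|^2$$

$$= \frac{1}{2}\left|e^{-\frac{i}{\hbar}E_1t}\frac{2}{\pi} + e^{-\frac{i}{\hbar}E_3t}\frac{2}{3\pi}\right|^2 = \frac{2}{\pi^2}\left|1 + \frac{1}{3}e^{-\frac{i}{\hbar}(E_3-E_1)t}\right|^2$$

$$= \frac{20}{9\pi^2}\left[1 + \frac{3}{5}\cos\left(\frac{1}{\hbar}(E_3-E_1)t\right)\right] \tag{8.490}$$

which is dependent on time. In this case the probability of observing this final
state oscillates between approximately 0.09 and 0.36, the values of
$\frac{20}{9\pi^2}\left(1 \mp \frac{3}{5}\right)$.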
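
As a numerical cross-check of this example, here is a short hedged sketch in Python (not from the original text). It assumes the step state is $2/\sqrt{a}$ on $0 < x < a/4$, that the eigenfunctions are $\sqrt{2/a}\cos(n\pi x/a)$ for odd $n$ and $\sqrt{2/a}\sin(n\pi x/a)$ for even $n$, and that the final state is $(|E_1\rangle + |E_3\rangle)/\sqrt{2}$. The function names are made up for the illustration.

```python
import math

def coeff(n):
    """<E_n|alpha> for the step state, obtained by integrating the
    eigenfunction over 0 < x < a/4 (the well width a cancels out)."""
    pref = 2.0 * math.sqrt(2.0) / (n * math.pi)
    if n % 2 == 1:
        return pref * math.sin(n * math.pi / 4.0)         # cosine eigenfunctions
    return pref * (1.0 - math.cos(n * math.pi / 4.0))     # sine eigenfunctions

def prob_beta(phase):
    """P(t) for |beta> = (|E_1> + |E_3>)/sqrt(2); phase = (E_3 - E_1) t / hbar."""
    c1, c3 = coeff(1), coeff(3)
    amp = (c1 + complex(math.cos(phase), -math.sin(phase)) * c3) / math.sqrt(2.0)
    return abs(amp) ** 2

# Completeness check: the squared coefficients should sum to (nearly) 1.
norm = sum(coeff(n) ** 2 for n in range(1, 20001))

# Extremes of the oscillation occur at relative phase 0 and pi.
p_max = prob_beta(0.0)
p_min = prob_beta(math.pi)
print(norm, p_min, p_max)
```

The sum of the squared coefficients approaching 1 is a useful sanity check on any set of expansion coefficients before they are used in a time-evolution calculation.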


8.10. Numerical Techniques 


We now introduce a numerical scheme for solving the 1-dimensional Schrödinger
equation. As we have seen, the 1-dimensional Schrödinger equation takes the
form

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x) \tag{8.491}$$

We will consider systems with the boundary conditions

$$\psi(x) \text{ and } \frac{d\psi(x)}{dx} \text{ are continuous and finite for all } x \tag{8.492}$$

$$\lim_{x\to\pm\infty}\psi(x) = 0 \tag{8.493}$$

$$\int_{-\infty}^{\infty}|\psi(x)|^2\,dx < \infty \quad\text{or the wave function is normalizable} \tag{8.494}$$




Let us illustrate the procedure with the harmonic oscillator potential (so we can
check our results against the exact answer). The potential energy function is

$$V(x) = \frac{1}{2}m\omega^2x^2 \tag{8.495}$$

The Schrödinger equation becomes

$$\frac{d^2\psi(x)}{dx^2} = \frac{2m}{\hbar^2}\left(\frac{1}{2}m\omega^2x^2 - E\right)\psi(x) \tag{8.496}$$

For real systems the very large or very small numerical values of $m$, $\omega$, and $\hbar$
can lead to calculational difficulties. For calculational simplicity we will choose

$$\hbar = \sqrt{2} \quad\text{and}\quad m = \omega = 1 \tag{8.497}$$

Since $\hbar\omega$ has units of energy, when the calculation is finished we can convert
our numerical energy values back to real physical units by the operation

$$E_{real} = E_{numerical}\frac{\hbar\omega}{\sqrt{2}} \tag{8.498}$$

We now have the equation

$$\frac{d^2\psi(x)}{dx^2} = \left(0.5x^2 - E\right)\psi(x) \tag{8.499}$$

to solve for the allowed $E$ values.

This potential energy function satisfies $V(x) = V(-x)$, so that the solutions will
have definite parity, that is, they will be even ($\psi(x) = \psi(-x)$) or odd ($\psi(x) = -\psi(-x)$).

We arbitrarily choose solutions of the form

even: $\psi(0) = 1$ and $\frac{d\psi(0)}{dx} = 0$
odd: $\psi(0) = 0$ and $\frac{d\psi(0)}{dx} = 1$

The particular numerical values chosen are arbitrary since the wave functions
we find have to be normalized anyway. These choices correspond to choosing
wave functions that look like those in Figure 8.17 below




Figure 8.17: Initial Values (sketches of $\psi(x)$ versus $x$ near the origin; panels: Even, Odd)

in order to satisfy this parity property.


The solution method then follows these steps: 


1. choose $\psi(0)$ and $\frac{d\psi(0)}{dx}$ (choose an even or odd solution type)

2. pick a starting energy value

3. with the chosen boundary conditions and energy value, break the second-order
differential equation into two first-order differential equations and
use a Runge-Kutta method to solve for the values of $\psi(x)$ for $x > 0$, i.e.,
if we let

$$y = \frac{d\psi}{dx} \quad\rightarrow\quad \frac{dy}{dx} = \frac{d^2\psi}{dx^2}$$

then we get the two coupled first-order differential equations

$$\frac{dy}{dx} = \left(0.5x^2 - E\right)\psi \quad\text{and}\quad \frac{d\psi}{dx} = y$$

4. the fourth-order Runge-Kutta equations corresponding to these equations
are then (here $h$ is the integration step size)

$$x_{k+1} = x_k + h$$

$$\psi_{k+1} = \psi_k + \frac{h}{6}\left(f_1 + 2f_2 + 2f_3 + f_4\right)$$

$$y_{k+1} = y_k + \frac{h}{6}\left(g_1 + 2g_2 + 2g_3 + g_4\right)$$

where

$$f(x,\psi,y,E) = y \quad\text{and}\quad g(x,\psi,y,E) = \left(0.5x^2 - E\right)\psi \tag{8.500}$$

and

$$f_1 = f(x_k,\psi_k,y_k,E)$$
$$f_2 = f\left(x_k + \tfrac{h}{2},\psi_k + \tfrac{h}{2}f_1,y_k + \tfrac{h}{2}g_1,E\right)$$
$$f_3 = f\left(x_k + \tfrac{h}{2},\psi_k + \tfrac{h}{2}f_2,y_k + \tfrac{h}{2}g_2,E\right)$$
$$f_4 = f\left(x_k + h,\psi_k + hf_3,y_k + hg_3,E\right)$$
$$g_1 = g(x_k,\psi_k,y_k,E)$$
$$g_2 = g\left(x_k + \tfrac{h}{2},\psi_k + \tfrac{h}{2}f_1,y_k + \tfrac{h}{2}g_1,E\right)$$
$$g_3 = g\left(x_k + \tfrac{h}{2},\psi_k + \tfrac{h}{2}f_2,y_k + \tfrac{h}{2}g_2,E\right)$$
$$g_4 = g\left(x_k + h,\psi_k + hf_3,y_k + hg_3,E\right)$$

5. an allowed energy value (an eigenvalue of the Hamiltonian) occurs when
the numerical solution approaches zero (exponential decrease) for large
values of $x$

6. since it is not possible to find the energy eigenvalues exactly (due to computer
roundoff errors) the numerical solution will always diverge (get very
large (either positive or negative)) at large values of $x$

7. the solution method can only home in on the energy eigenvalue

The program code below carries out the following process:

1. choose an energy eigenvalue lower than the desired value

2. choose an energy step value

3. the solution will eventually diverge (say it gets large positively)

4. increase the energy value by the energy step

5. the solution will eventually diverge (say it still gets large positively)

6. increase the energy value by the energy step

7. the solution will eventually diverge negatively

8. an energy eigenvalue is now located between this energy value (where it
diverged negatively) and the last energy value (where it diverged positively)

9. change the energy value back to the previous value (where it diverged
positively) and decrease the energy step by a factor of 10

10. repeat the process until it diverges negatively again

11. change the energy value back to the previous value (where it diverged
positively) and decrease the energy step by a factor of 10

12. continue until the desired accuracy is reached
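
The process just listed can also be sketched in Python for the harmonic oscillator equation $\psi'' = (0.5x^2 - E)\psi$ with $\hbar = \sqrt{2}$ and $m = \omega = 1$. This is a minimal sketch, not the author's program: the function names, the step size, the cutoff `x_max`, the divergence bound, and the starting values are all assumptions chosen for the illustration.

```python
import math

def g(x, psi, E):
    """Right-hand side of dy/dx = (0.5 x^2 - E) psi, where y = dpsi/dx."""
    return (0.5 * x * x - E) * psi

def divergence_sign(E, h=0.01, x_max=6.0, psi0=1.0, y0=0.0, bound=100.0):
    """Integrate outward from x = 0 with 4th-order Runge-Kutta and report
    whether psi ends up large and positive (+1) or large and negative (-1).
    psi0 = 1, y0 = 0 selects an even solution."""
    x, psi, y = 0.0, psi0, y0
    while x < x_max:
        f1 = y
        g1 = g(x, psi, E)
        f2 = y + 0.5 * h * g1
        g2 = g(x + 0.5 * h, psi + 0.5 * h * f1, E)
        f3 = y + 0.5 * h * g2
        g3 = g(x + 0.5 * h, psi + 0.5 * h * f2, E)
        f4 = y + h * g3
        g4 = g(x + h, psi + h * f3, E)
        psi += h * (f1 + 2 * f2 + 2 * f3 + f4) / 6.0
        y += h * (g1 + 2 * g2 + 2 * g3 + g4) / 6.0
        x += h
        if abs(psi) > bound:
            break
    return 1 if psi > 0 else -1

def find_eigenvalue(E_start=0.1, step=0.05, rel_tol=1e-6):
    """Step the energy upward; whenever the next step would flip the sign of
    the divergence, stay at the current energy and shrink the step by 10."""
    E = E_start
    ref = divergence_sign(E)
    while step / E > rel_tol:
        if divergence_sign(E + step) == ref:
            E += step          # still below the eigenvalue
        else:
            step /= 10.0       # eigenvalue bracketed between E and E + step
    return E

E_numerical = find_eigenvalue()
E_over_hbar_omega = E_numerical / math.sqrt(2.0)   # conversion E_real = E_num * hbar*omega / sqrt(2)
print(E_numerical, E_over_hbar_omega)
```

Starting below the lowest even level, the search homes in on a value close to $\sqrt{2}/2$, which corresponds to $\frac{1}{2}\hbar\omega$ after the unit conversion. Choosing `psi0 = 0`, `y0 = 1` in `divergence_sign` would select odd solutions instead.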

Sample programs (written in the MATLAB language)

function z=sq(x,psi,y,alpha)
% returns the second derivative of psi; the form changes at x = 1
if (x < 1)
    z=-15*(1-alpha)*psi;
else
    z=15*alpha*psi;
end

% energy 0 < alpha < 1; search for eigenvalues
% choose E lower than desired eigenvalue; choose estep a reasonable size
% and to move you in direction of an eigenvalue; program will home in
% and stop when estep/E<.0001; uses 4th order Runge-Kutta method
format long
estep=-0.05;alpha=0.99;h=0.05;n=0;Eold=10^25;
while 1
    n=n+1;psi=1;y=0;
    for j=0:1000
        f1=y;
        g1=sq(j*h,psi,y,alpha);f2=y+h*g1/2;
        g2=sq((j+0.5)*h,psi+h*f1/2,y+h*g1/2,alpha);f3=y+h*g2/2;
        g3=sq((j+0.5)*h,psi+h*f2/2,y+h*g2/2,alpha);f4=y+h*g3;
        g4=sq((j+1)*h,psi+h*f3,y+h*g3,alpha);
        psi=psi+h*(f1+2*f2+2*f3+f4)/6;
        if (abs(psi) > 100)
            % solution has diverged: compare its sign with the first run
            if (n == 1)
                check=sign(psi); alpha=alpha+estep;
            else
                if (check ~= sign(psi))
                    % sign flipped: step back and refine the step size
                    alpha=alpha-estep; estep = estep/10;
                else
                    alpha=alpha+estep;
                end
            end
            break;
        end
        y=y+h*(g1+2*g2+2*g3+g4)/6;
    end
    stp=(abs(estep/alpha));
    [alpha,stp]
    if (stp < 0.0001)
        break;
    end
end
format short


A typical run looks like the table shown below.
Using the conversion rule the energy eigenvalue is

$$E_{real} = \left(\frac{\sqrt{2}}{2}\right)\frac{\hbar\omega}{\sqrt{2}} = \frac{1}{2}\hbar\omega \tag{8.501}$$

which is exactly correct for the lowest even state eigenvalue (the ground state).

E-Value      StepSize
0.150000     0.333333
0.200000     0.250000
0.250000     0.200000
0.300000     0.166667
0.350000     0.142857
0.400000     0.125000
0.450000     0.111111
0.500000     0.100000
0.550000     0.0909091
0.600000     0.0833333
0.650000     0.0769231
0.700000     0.0714286
0.750000     0.0666667
0.700000     0.0071428
0.705000     0.0070922
0.710000     0.0070422
0.705000     0.0007092
0.705500     0.0007087
0.706000     0.0007082
0.706500     0.0007077
0.707000     0.0007072
0.707500     0.0007067
0.707000     0.0000707

Table 8.1: Run Values


8.11. Translation Invariant Potential - Bands 

We now consider a potential that is periodic in space, which means that it is
translationally invariant, i.e.,

$$V(x) = V(x + a) \quad\text{where } a = \text{some fixed distance} \tag{8.502}$$

This might represent an electron in a 1-dimensional crystal lattice, where $a$ is
the distance between atoms. The change in the allowed energy spectrum caused
by the periodicity of the lattice is rather striking.

The Hamiltonian of the system is

$$\hat{H}(x,p) = \frac{p^2}{2m} + V(x) \tag{8.503}$$

This Hamiltonian is invariant if we translate a fixed distance $a$

$$\hat{H}(x + a,p) = \frac{p^2}{2m} + V(x + a) = \frac{p^2}{2m} + V(x) = \hat{H}(x,p) \tag{8.504}$$

This means that the transformation operator $\hat{T}(a)$ that corresponds to a translation
by a distance $a$ commutes with the Hamiltonian

$$\left[\hat{H},\hat{T}(a)\right] = 0 \tag{8.505}$$

and thus they must share a set of eigenfunctions.

From our earlier discussion, the translation operator has the defining property
that, if the state $|\psi\rangle$ has the wave function $\langle x|\psi\rangle$ in the position representation,
then the state $\hat{T}(a)|\psi\rangle$ has the wave function

$$\langle x|\hat{T}(a)|\psi\rangle = \langle x + a|\psi\rangle \tag{8.506}$$

Let us rederive the eigenfunctions of $\hat{T}(a)$. If $|\psi\rangle$ is an eigenstate of the $\hat{T}(a)$
operator, then we must have

$$\hat{T}(a)|\psi\rangle = \lambda|\psi\rangle \quad\text{where } \lambda = \text{the eigenvalue} \tag{8.507}$$

and thus

$$\psi(x + a) = \langle x|\hat{T}(a)|\psi\rangle = \lambda\langle x|\psi\rangle = \lambda\psi(x) \tag{8.508}$$

With hindsight, we write $\lambda$ in the form $e^{ika}$, which defines $k$. We then have

$$\psi(x + a) = e^{ika}\psi(x) \tag{8.509}$$

If we define

$$u_k(x) = e^{-ikx}\psi(x) \tag{8.510}$$

we then have that

$$\psi(x + a) = e^{ik(x+a)}u_k(x + a) = e^{ika}\psi(x) = e^{ika}e^{ikx}u_k(x) \tag{8.511}$$

$$u_k(x + a) = u_k(x) \tag{8.512}$$

which says that $u_k(x)$ is also periodic in space with period $a$. The eigenfunctions
of $\hat{T}(a)$ are then given by

$$\psi(x) = e^{ikx}u_k(x) \tag{8.513}$$

which is a plane wave modulated by a function with the periodicity of the lattice
(potential).

This type of wave function is called a Bloch wave.

We note that $k$ must be real since if $k$ were not real, then the eigenfunction
would diverge as $x$ gets large and thus be unnormalizable, which is not allowed.

Since $\hat{H}$ and $\hat{T}(a)$ must share a set of eigenfunctions, we can construct a complete
set of energy eigenstates in the form of Bloch waves. This means that we
can find at least one energy eigenstate for each possible $\lambda = e^{ika}$, since all $\lambda$ are
eigenvalues of $\hat{T}(a)$.

Note also that the values $k$ and $k + 2\pi n/a$ give the same eigenvalue and thus we
can restrict our attention to $k$ values in the interval

$$-\frac{\pi}{a} < k < \frac{\pi}{a} \tag{8.514}$$

and we will be including all possible eigenvalues of $\hat{T}(a)$.

Example 

Let us now consider the specific example of a 1-dimensional periodic array of
delta functions

$$V(x) = \sum_{n=-\infty}^{\infty}v_0\delta(x - na) \tag{8.515}$$

which corresponds to the so-called Kronig-Penney model.

In the interval $0 < x < a$ the potential vanishes and the solution to the Schrödinger
equation is of the form

$$\psi(x) = Ae^{iqx} + Be^{-iqx} \quad\text{where}\quad E = \frac{\hbar^2q^2}{2m} \tag{8.516}$$

Therefore, in the interval $0 < x < a$, we must have

$$u_k(x) = Ae^{i(q-k)x} + Be^{-i(q+k)x} \tag{8.517}$$

The coefficients $A$ and $B$ are fully determined by two conditions:

1. $\psi(x)$ and hence $u_k(x)$ must be continuous at the lattice sites

$$\lim_{\eta\to 0}\left(u_k(\eta) - u_k(-\eta)\right) \to 0$$

Periodicity then says that $u_k(-\eta) = u_k(a - \eta)$ and hence we must also have

$$\lim_{\eta\to 0}\left(u_k(\eta) - u_k(a - \eta)\right) \to 0$$

which gives

$$A + B = Ae^{i(q-k)a} + Be^{-i(q+k)a} \quad\text{(condition 1)} \tag{8.518}$$

2. $d\psi(x)/dx$ is discontinuous at the lattice points, as we showed earlier, as
follows:

$$\frac{\hbar^2}{2m}\lim_{\eta\to 0}\left[\left.\frac{d\psi(x)}{dx}\right|_{x=\eta} - \left.\frac{d\psi(x)}{dx}\right|_{x=-\eta}\right] = v_0\psi(0) \tag{8.519}$$

We have

$$\lim_{\eta\to 0}\left.\frac{d\psi(x)}{dx}\right|_{x=\eta} = iq(A - B)$$

$$\lim_{\eta\to 0}\left.\frac{d\psi(x)}{dx}\right|_{x=-\eta} = e^{-ika}\lim_{\eta\to 0}\left.\frac{d\psi(x)}{dx}\right|_{x=a-\eta} = e^{-ika}iq\left(Ae^{iqa} - Be^{-iqa}\right)$$

$$\psi(0) = A + B$$

which gives

$$\frac{\hbar^2}{2m}iq\left(A - B - Ae^{i(q-k)a} + Be^{-i(q+k)a}\right) = v_0(A + B) \quad\text{(condition 2)} \tag{8.520}$$


We only need these two conditions and do not need to worry about all of the 
rest of the potential because of the periodicity. 

We solve for the ratio $A/B$ in each equation and equate the two results to get
a transcendental equation for $q$ in terms of $k$ (which then gives us the allowed
energy values). We get the equation

$$\cos ka = \cos qa + \frac{mv_0}{q\hbar^2}\sin qa \tag{8.521}$$

Now for any given value of $k$ we must have

$$-1 \le \cos ka \le +1 \tag{8.522}$$

Therefore, if we plot

$$\cos ka = \cos qa + \frac{mv_0a}{\hbar^2}\frac{\sin qa}{qa} \quad\text{versus } qa \tag{8.523}$$

the only allowed $q$ values are between the two lines representing $\pm 1$ as shown in
Figure 8.18 below, i.e., $q$ must always lie in the unshaded areas. For each value
of $\cos ka$ there are an infinite number of solutions. Taking $k$ in the range
$-\pi/a < k < \pi/a$, we can plot

$$E = \frac{\hbar^2q^2}{2m} \quad\text{versus } ka$$

as shown in Figure 8.19 below.


The possible energy values lie in bands, with gaps between them. This structure 
of the energy spectrum with bands and gaps occurs for all periodic potentials. 

It is the basis of the explanation of energy bands in solids, which we will discuss
in detail later.

Figure 8.18: Allowed q values

Figure 8.19: Bands and Gaps
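
The band condition can be explored numerically. The sketch below (Python, not part of the original text) scans $qa$ and records the intervals where the right-hand side of $\cos ka = \cos qa + \frac{mv_0a}{\hbar^2}\frac{\sin qa}{qa}$ lies between $-1$ and $+1$. The dimensionless strength `strength` $= mv_0a/\hbar^2$, its value of 3, the scan range, and the function names are assumptions made only for this illustration.

```python
import math

def rhs(qa, strength):
    """cos(qa) + strength * sin(qa)/(qa), with strength = m*v0*a/hbar^2."""
    if qa == 0.0:
        return 1.0 + strength          # limit of sin(qa)/(qa) as qa -> 0
    return math.cos(qa) + strength * math.sin(qa) / qa

def allowed_bands(strength, qa_max=12.0, n=120000):
    """Return a list of (qa_low, qa_high) intervals on which |rhs| <= 1,
    i.e. on which a real k exists. Resolution is qa_max / n."""
    bands, start, prev = [], None, 0.0
    for i in range(1, n + 1):
        qa = qa_max * i / n
        ok = abs(rhs(qa, strength)) <= 1.0
        if ok and start is None:
            start = qa                  # entering an allowed band
        elif not ok and start is not None:
            bands.append((start, prev)) # leaving an allowed band
            start = None
        prev = qa
    if start is not None:
        bands.append((start, qa_max))   # band still open at the end of the scan
    return bands

bands = allowed_bands(strength=3.0)
for lo, hi in bands:
    # Band edges in energy follow from E = hbar^2 q^2 / (2m); here we print qa only.
    print(round(lo, 4), round(hi, 4))
```

For a repulsive array ($v_0 > 0$) the upper edge of each band found this way sits at $qa = n\pi$, and the gaps open just above those values, which is the structure sketched in the two figures.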


8.12. Closed Commutator Algebras and Glauber’s Theorem

The earlier example of the solution of the harmonic oscillator potential illustrates
a general rule about algebraic solutions.

In that example we had the following commutator algebra

$$\left[\hat{a},\hat{a}^+\right] = \hat{I} \quad,\quad \left[\hat{a},\hat{N}\right] = \hat{a} \quad,\quad \left[\hat{a}^+,\hat{N}\right] = -\hat{a}^+ \tag{8.524}$$

In this algebra, all the commutators only involve a closed set of operators. In
this case the set is $\{\hat{a}, \hat{a}^+, \text{ and } \hat{N}\}$.

When the algebra is closed, we call it a closed commutator algebra.

If we have a closed commutator algebra, then, in general, we can completely
solve the eigenvalue/eigenvector problem for these operators using algebraic
methods.

Some examples are the harmonic oscillator, systems that can be factored, and,
as we shall show later, angular momentum operators, and the hydrogen atom
system.

A very useful algebraic result is Glauber’s Theorem, which we now prove.


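For two operators $\hat{A}$ and $\hat{B}$ assume that

$$\left[\hat{A},\left[\hat{A},\hat{B}\right]\right] = \left[\hat{B},\left[\hat{A},\hat{B}\right]\right] = 0 \tag{8.525}$$

Now let $x$ be a parameter and consider the operator

$$f(x) = e^{\hat{A}x}e^{\hat{B}x} \tag{8.526}$$

We then get

$$\frac{df}{dx} = \hat{A}e^{\hat{A}x}e^{\hat{B}x} + e^{\hat{A}x}\hat{B}e^{\hat{B}x} = \hat{A}e^{\hat{A}x}e^{\hat{B}x} + e^{\hat{A}x}\hat{B}e^{-\hat{A}x}e^{\hat{A}x}e^{\hat{B}x}$$

$$= \left(\hat{A} + e^{\hat{A}x}\hat{B}e^{-\hat{A}x}\right)e^{\hat{A}x}e^{\hat{B}x} = \left(\hat{A} + e^{\hat{A}x}\hat{B}e^{-\hat{A}x}\right)f(x)$$

Our assumption (8.525) implies that (proof later)

$$\left[\hat{B},\hat{A}^n\right] = n\hat{A}^{n-1}\left[\hat{B},\hat{A}\right] \tag{8.527}$$

and

$$\left[\hat{B},e^{-\hat{A}x}\right] = \sum_{n}(-1)^n\frac{x^n}{n!}\left[\hat{B},\hat{A}^n\right] = -e^{-\hat{A}x}\left[\hat{B},\hat{A}\right]x \tag{8.528}$$

which gives

$$\left[\hat{B},e^{-\hat{A}x}\right] = \hat{B}e^{-\hat{A}x} - e^{-\hat{A}x}\hat{B} = -e^{-\hat{A}x}\left[\hat{B},\hat{A}\right]x \tag{8.529}$$

$$e^{\hat{A}x}\hat{B}e^{-\hat{A}x} = \hat{B} - \left[\hat{B},\hat{A}\right]x \tag{8.530}$$

Thus, we have

$$\frac{df}{dx} = \left(\hat{A} + e^{\hat{A}x}\hat{B}e^{-\hat{A}x}\right)f(x) = \left(\hat{A} + \hat{B} + \left[\hat{A},\hat{B}\right]x\right)f(x) \tag{8.531}$$

Now, our assumption (8.525) says that

$$\left[(\hat{A} + \hat{B}),\left[\hat{A},\hat{B}\right]\right] = 0 \tag{8.532}$$

This means that we can treat them like numbers to solve the above differential
equation, i.e., we do not have to be careful about operator order. The solution
is

$$f(x) = e^{\hat{A}x}e^{\hat{B}x} = e^{(\hat{A}+\hat{B})x}e^{\frac{1}{2}\left[\hat{A},\hat{B}\right]x^2} \tag{8.533}$$

Letting $x = 1$ we have the Baker-Hausdorff identity (Glauber’s theorem)

$$e^{(\hat{A}+\hat{B})} = e^{\hat{A}}e^{\hat{B}}e^{-\frac{1}{2}\left[\hat{A},\hat{B}\right]} \tag{8.534}$$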

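This leaves us to prove (8.527). Consider some special cases to see how it works.
Throughout we use the assumption that $[\hat{B},\hat{A}]$ commutes with $\hat{A}$.

$n = 1$:
$$\left[\hat{B},\hat{A}\right] = n\hat{A}^{n-1}\left[\hat{B},\hat{A}\right]$$

$n = 2$:
$$\left[\hat{B},\hat{A}^2\right] = \hat{B}\hat{A}^2 - \hat{A}^2\hat{B} = \left[\left[\hat{B},\hat{A}\right],\hat{A}\right] + 2\hat{A}\hat{B}\hat{A} - 2\hat{A}^2\hat{B}$$
$$= 2\hat{A}\hat{B}\hat{A} - 2\hat{A}^2\hat{B} = 2\hat{A}\left[\hat{B},\hat{A}\right] = n\hat{A}^{n-1}\left[\hat{B},\hat{A}\right]$$

$n = 3$:
$$\left[\hat{B},\hat{A}^3\right] = \hat{B}\hat{A}^3 - \hat{A}^3\hat{B} = \hat{B}\hat{A}^2\hat{A} - \hat{A}\hat{A}^2\hat{B}$$
$$= \left[\left[\hat{B},\hat{A}^2\right] + \hat{A}^2\hat{B}\right]\hat{A} - \hat{A}\left[\hat{B}\hat{A}^2 - \left[\hat{B},\hat{A}^2\right]\right]$$
$$= 2\hat{A}\left[\hat{B},\hat{A}\right]\hat{A} + \hat{A}^2\hat{B}\hat{A} - \hat{A}\hat{B}\hat{A}^2 + 2\hat{A}^2\left[\hat{B},\hat{A}\right]$$
$$= 4\hat{A}^2\left[\hat{B},\hat{A}\right] + \hat{A}^2\hat{B}\hat{A} - \hat{A}^3\hat{B} - 2\hat{A}^2\left[\hat{B},\hat{A}\right]$$
$$= 2\hat{A}^2\left[\hat{B},\hat{A}\right] + \hat{A}^2\left[\hat{B},\hat{A}\right] = 3\hat{A}^2\left[\hat{B},\hat{A}\right] = n\hat{A}^{n-1}\left[\hat{B},\hat{A}\right]$$

This suggests a general proof by induction. Assume that

$$\left[\hat{B},\hat{A}^{n-1}\right] = (n-1)\hat{A}^{n-2}\left[\hat{B},\hat{A}\right]$$

is true. Then, we have

$$\left[\hat{B},\hat{A}^n\right] = \hat{B}\hat{A}^n - \hat{A}^n\hat{B} = \hat{B}\hat{A}^{n-1}\hat{A} - \hat{A}\hat{A}^{n-1}\hat{B}$$
$$= \left[\hat{A}^{n-1}\hat{B} + (n-1)\hat{A}^{n-2}\left[\hat{B},\hat{A}\right]\right]\hat{A} - \hat{A}\left[\hat{B}\hat{A}^{n-1} - (n-1)\hat{A}^{n-2}\left[\hat{B},\hat{A}\right]\right]$$
$$= 2(n-1)\hat{A}^{n-1}\left[\hat{B},\hat{A}\right] + \hat{A}^{n-1}\hat{B}\hat{A} - \hat{A}\hat{B}\hat{A}^{n-1}$$
$$= 2(n-1)\hat{A}^{n-1}\left[\hat{B},\hat{A}\right] + \hat{A}^{n-1}\hat{B}\hat{A} - \hat{A}\left[\hat{A}^{n-1}\hat{B} + (n-1)\hat{A}^{n-2}\left[\hat{B},\hat{A}\right]\right]$$
$$= (n-1)\hat{A}^{n-1}\left[\hat{B},\hat{A}\right] + \hat{A}^{n-1}\left[\hat{B},\hat{A}\right] = n\hat{A}^{n-1}\left[\hat{B},\hat{A}\right]$$

Done!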
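
A quick numerical illustration of Glauber's theorem can be made with small matrices. The sketch below (Python, not from the original text) uses $3\times 3$ strictly upper-triangular matrices, for which $[\hat{A},\hat{B}]$ commutes with both $\hat{A}$ and $\hat{B}$ and the exponential series terminates after the quadratic term. The particular matrices, the entries 0.7 and -1.3, and the helper names are assumptions chosen for the example.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(X, Y, s=1.0):
    """Return X + s*Y entry by entry."""
    return [[x + s * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def commutator(X, Y):
    return lincomb(matmul(X, Y), matmul(Y, X), -1.0)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def expm_nilpotent(M):
    """exp(M) for a strictly upper-triangular 3x3 matrix: M^3 = 0, so the
    power series is exactly I + M + M^2/2."""
    return lincomb(lincomb(IDENTITY, M), matmul(M, M), 0.5)

# Hypothetical test operators: [A, B] is proportional to the corner element
# and commutes with both A and B, as the theorem requires.
A = [[0.0, 0.7, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.3], [0.0, 0.0, 0.0]]
C = commutator(A, B)

lhs = expm_nilpotent(lincomb(A, B))                       # exp(A + B)
minus_half_C = [[-0.5 * c for c in row] for row in C]
rhs = matmul(matmul(expm_nilpotent(A), expm_nilpotent(B)),
             expm_nilpotent(minus_half_C))                # exp(A) exp(B) exp(-[A,B]/2)

max_diff = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
print(max_diff)
```

Because these matrices are nilpotent the check is exact up to rounding; it illustrates the identity but is of course not a proof.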


8.13. Quantum Interference with Slits 

In the experiments considered here, we measure the y-component of momentum 
for a particle passing through a system of slits. The source-slit system is the 
preparation apparatus that determines the state vector. Recognizing that a 
system of slits is a position-measuring device allows us to ascertain that the 
state vector is a position state. Then, writing the state vector in momentum 
space provides a straightforward calculation for the probability amplitude and 
its corresponding probability function. Interference effects, if any, are inherent in 
the probability function. We determine the statistical distribution of scattered 
particles for four different slit systems. The results are in agreement with the
well-known interference patterns obtained in classical wave optics.

8.13.1. Introduction

The double-slit experiment is the archetypical system used to demonstrate quan¬ 
tum mechanical behavior. It is said by Feynman “to contain the only mystery” 
in all of quantum mechanics. Numerous textbooks and journal articles discuss 
slit interference, usually in conjunction with wave-particle duality. Most authors 
emphasize that classical physics cannot describe the double slit experiment with 
particles. Yet, bolstered by the de Broglie hypothesis, they still ascribe to the 
classical maxim, “Waves exhibit interference. Particles do not”. They then con¬ 
clude that, “When particles exhibit interference, they are behaving like waves”. 
Then the subsequent analysis is simply wave theory, and any interference effects 
are made to agree with Young’s experiment. 

Thus, classical wave optics, rather than quantum mechanics, is used to explain
quantum interference. For example, Ohanian states “ ... the maxima of this
interference pattern are given by a formula familiar from wave optics”. Some
authors do suggest that a quantum mechanical approach is lacking. Liboff tells
us, “The first thing to do is to solve Schrödinger’s equation and calculate $|\psi|^2$
at the screen”. Ballentine makes a similar statement when discussing diffraction
from a periodic array: “... solve the Schrödinger equation with boundary conditions
corresponding to an incident beam from a certain direction, and hence
determine the position probability density $|\Psi(x)|^2$ at the detectors”. But he
then says, “an exact solution of this equation would be very difficult to obtain,
...”. The difficulty according to Merzbacher (5) is that, “A careful analysis of
the interference experiment would require detailed consideration of the boundary
conditions at the slits”.

In spite of these misgivings, quantum mechanics does provide a straightforward 
calculation for the probability distribution of the scattered particles. 

Quantum mechanics is a theory about observations and their measurement. Its
postulates provide, among other things, a set of instructions for calculating the
probability of obtaining a particular result when an observable is measured.
These probability calculations require a state vector $|\psi\rangle$, which is determined
by the preparation procedure. Its representation is dictated by the observable
being measured

$$|\psi\rangle = \sum_{k}|a_k\rangle\langle a_k|\psi\rangle \tag{8.535}$$

The basis vectors $|a_k\rangle$ are the eigenvectors of the measured observable $\hat{A}$. Having
obtained the state vector $|\psi\rangle$, the probability that a measurement of observable
$\hat{A}$ yields the value $a_k$ is given by the Born postulate

$$P_k = |\langle a_k|\psi\rangle|^2 \tag{8.536}$$

The state vector $|\psi\rangle$ and the probability distribution $|\langle a_k|\psi\rangle|^2$ are unique for a
given experiment. State preparation and measurement are discussed at length
in Chapter 15.

We expect, then, that a quantum mechanical description of a slit experiment
will

1. clearly define which observable is being measured,

2. describe the preparation procedure that determines the state vector,

3. yield the probability function for the scattered particles.

8.13.2. Probability functions and quantum interference 

The experiment considered here consists of the apparatus shown in Figure 8.20 
below. For such an experiment, the source-slit system, which determines the 
possible y-coordinate(s) of the particle at the slits, is the preparation apparatus. 



Figure 8.20: Particle Scattering from Slits 


A particle originating at the source is scattered at angle $\theta$ by the system of
slits. Thus, a particle passing through the slits has undergone a unique state
preparation that determines the probability of scattering at angle $\theta$.

The state vector is a position state. Because position and momentum are noncommuting
observables, a particle passing through slits always has an uncertainty
in its y-component of momentum. It can be scattered with any one of
the continuum of momentum eigenvalues $p_y = p\sin\theta$, where $-\pi/2 \le \theta \le \pi/2$.
Measurement of a well-defined scattering angle $\theta$ constitutes a measurement of
the observable $\hat{p}_y$ and, therefore, the basis vectors in Hilbert space are the momentum
eigenvectors $|p_y\rangle$.

The probability that a particle leaves the slit apparatus with momentum $p_y$ is,
then,

$$P(p_y) = |\langle p_y|\psi\rangle|^2 \tag{8.537}$$

It is this probability function that exhibits quantum interference. Its maxima
and minima, if any, correspond to constructive and destructive interference
respectively.

In the position representation, the free-particle momentum eigenfunction corresponding
to the eigenvalue $p_y$ is

$$\langle y|p_y\rangle = \frac{1}{\sqrt{2\pi}}e^{ip_yy/\hbar} \tag{8.538}$$

and the probability amplitude for scattering with momentum $p_y$ is

$$\langle p_y|\psi\rangle = \int_{-\infty}^{\infty}\langle p_y|y\rangle\langle y|\psi\rangle\,dy = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-ip_yy/\hbar}\psi(y)\,dy \tag{8.539}$$

An examination of the corresponding probability function

$$P(p_y) = |\langle p_y|\psi\rangle|^2$$

will determine whether or not there is interference.

In the following discussion, we evaluate the integral in the amplitude equation (8.539) by
first constructing the position state function

$$\psi(y) = \langle y|\psi\rangle \tag{8.540}$$

We do this for four source-slit systems, including the double slit.

8.13.3. Scattering from a narrow slit 

A narrow slit of infinitesimal width is an ideal measuring device; it determines
the position with infinite resolution. A particle emerging from a slit at $y = y_1$ is
in the position eigenstate $|y_1\rangle$. In the position representation, the eigenfunction
of position is the Dirac delta function

$$\psi(y) = \langle y|y_1\rangle = \delta(y - y_1) \tag{8.541}$$

and the probability amplitude for a particle emerging from the slit with momentum
$p_y$ is

$$\langle p_y|\psi\rangle = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-ip_yy/\hbar}\delta(y - y_1)\,dy = \frac{1}{\sqrt{2\pi}}e^{-ip_yy_1/\hbar} \tag{8.542}$$

The corresponding probability function is

$$P(p_y) = |\langle p_y|\psi\rangle|^2 = \left|\frac{1}{\sqrt{2\pi}}e^{-ip_yy_1/\hbar}\right|^2 = \text{constant} \tag{8.543}$$

It is equally probable that the particle is scattered at any angle. There is no
interference.


8.13.4. Scattering from a double slit (narrow)

We again assume that the slits are infinitesimally thin. For such a double slit
apparatus, the observable $y$ has two eigenvalues, $y_1$ and $y_2$. Assuming the
source-slit geometry does not favor one slit over the other, the state vector is
the superposition of position eigenvectors

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|y_1\rangle + |y_2\rangle\right) \tag{8.544}$$

and

$$\psi(y) = \frac{1}{\sqrt{2}}\left(\delta(y - y_1) + \delta(y - y_2)\right) \tag{8.545}$$

Here, the amplitude for finding the particle with momentum $p_y$ is

$$\langle p_y|\psi\rangle = \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{2}}\left[\int_{-\infty}^{\infty}e^{-ip_yy/\hbar}\delta(y - y_1)\,dy + \int_{-\infty}^{\infty}e^{-ip_yy/\hbar}\delta(y - y_2)\,dy\right]$$

$$= \frac{1}{2\sqrt{\pi}}\left(e^{-ip_yy_1/\hbar} + e^{-ip_yy_2/\hbar}\right) \tag{8.546}$$

From which we get the probability function

$$P(p_y) = |\langle p_y|\psi\rangle|^2 = \left|\frac{1}{2\sqrt{\pi}}\left(e^{-ip_yy_1/\hbar} + e^{-ip_yy_2/\hbar}\right)\right|^2$$

$$= \frac{1}{4\pi}\left(2 + e^{ip_y(y_1-y_2)/\hbar} + e^{-ip_y(y_1-y_2)/\hbar}\right) = \frac{1}{2\pi}\left(1 + \cos\left(\frac{p_yd}{\hbar}\right)\right) \tag{8.547}$$

where $d = y_1 - y_2$ is the distance between the slits. We see that this probability
function does have relative maxima and minima and quantum interference does
occur. Using $p_y = p\sin\theta$ we obtain the angular distribution of scattered particles

$$P(\theta) = \frac{1}{2\pi}\left[1 + \cos\left(\frac{pd\sin\theta}{\hbar}\right)\right] \tag{8.548}$$

The distance between the slits determines the number of interference fringes. In
this example the distance between the slits is $d = 4\lambda$, where $\lambda$ is the de Broglie
wavelength. If we define $\phi = pd\sin\theta/\hbar$ and use the half-angle formula

$$1 + \cos\phi = 2\cos^2\left(\frac{\phi}{2}\right) \tag{8.549}$$

the probability function takes the form

$$P(\phi) = \frac{1}{\pi}\cos^2\left(\frac{\phi}{2}\right) \tag{8.550}$$

This is the familiar intensity distribution for Fraunhofer diffraction. A plot of
(8.548) is shown in Figure 8.21.

Figure 8.21: Angular Distribution of scattered particles from double narrow slits
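
To see how the slit separation fixes the number of fringes, the angular distribution for two narrow slits can be evaluated directly. The following is a hedged sketch in Python (not from the original text); it uses the de Broglie relation $p/\hbar = 2\pi/\lambda$, so that the phase is $2\pi(d/\lambda)\sin\theta$, and the function names are made up for the example.

```python
import math

def p_double_narrow(theta, d_over_lambda):
    """P(theta) = [1 + cos(p d sin(theta)/hbar)] / (2 pi),
    written with p/hbar = 2 pi / lambda (de Broglie)."""
    phi = 2.0 * math.pi * d_over_lambda * math.sin(theta)
    return (1.0 + math.cos(phi)) / (2.0 * math.pi)

def maxima_angles(d_over_lambda):
    """Angles of constructive interference, d sin(theta) = m lambda,
    for every integer m with |m| <= d/lambda."""
    m_max = int(math.floor(d_over_lambda))
    return [math.asin(m / d_over_lambda) for m in range(-m_max, m_max + 1)]

def minima_angles(d_over_lambda):
    """Angles of destructive interference, d sin(theta) = (m + 1/2) lambda."""
    out = []
    m = -int(math.floor(d_over_lambda)) - 1
    while (m + 0.5) / d_over_lambda <= 1.0:
        s = (m + 0.5) / d_over_lambda
        if s >= -1.0:
            out.append(math.asin(s))
        m += 1
    return out

angles_max = maxima_angles(4.0)     # d = 4 lambda, as in the example in the text
angles_min = minima_angles(4.0)
print(len(angles_max), len(angles_min))
```

For $d = 4\lambda$ the maxima occur at $\sin\theta = m/4$ with $m = -4,\ldots,4$, and the zeros of the distribution lie halfway between them in $\sin\theta$.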


8.13.5. Scattering from a slit of finite width 


A slit of finite width is an imperfect apparatus for measuring position. It cannot
distinguish between different position eigenvalues and a particle emerging from
a slit of width $a$ can have any value of observable $y$ in the continuum $-a/2 \le y \le a/2$.
Assuming an equal probability of passing through the slit at any point, a
particle at the slit is in the superposition state

$$\psi(y) = \langle y|\psi\rangle = \begin{cases}\dfrac{1}{\sqrt{a}} & -a/2 \le y \le a/2 \\ 0 & \text{elsewhere}\end{cases} \tag{8.551}$$

Here, the probability amplitude is

$$\langle p_y|\psi\rangle = \frac{1}{\sqrt{2\pi a}}\int_{-a/2}^{a/2}e^{-ip_yy/\hbar}\,dy = \frac{i\hbar}{p_y\sqrt{2\pi a}}\left(e^{-ip_ya/2\hbar} - e^{ip_ya/2\hbar}\right)$$

$$= \frac{2\hbar}{p_y\sqrt{2\pi a}}\sin\left(ap_y/2\hbar\right) \tag{8.552}$$

and the corresponding probability function is

$$P(p_y) = |\langle p_y|\psi\rangle|^2 = \frac{2\hbar^2}{\pi ap_y^2}\sin^2\left(ap_y/2\hbar\right) \tag{8.553}$$

This result is shown below in Figure 8.22 in terms of the scattering angle $\theta$.

Figure 8.22: Angular Distribution of scattered particles from single slit of finite
width


We see that $P(p_y)$ is the well-known diffraction pattern for light if we define
$\alpha = ap_y/2\hbar = ap\sin\theta/2\hbar$ and write

$$P(\alpha) = \frac{a}{2\pi}\left(\frac{\sin\alpha}{\alpha}\right)^2 \tag{8.554}$$


8.13.6. Scattering from a double finite-width slit 

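As a final example, we consider a particle passing through a double-slit apparatus
consisting of two slits each of finite width $a$. This is the most realistic
description of the double slit experiment. Here, the state vector is

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|\psi_1\rangle + |\psi_2\rangle\right) \tag{8.555}$$

where

$$\langle y|\psi_1\rangle = \begin{cases}\dfrac{1}{\sqrt{a}} & y_1 - a/2 \le y \le y_1 + a/2 \\ 0 & \text{elsewhere}\end{cases} \tag{8.556}$$

and

$$\langle y|\psi_2\rangle = \begin{cases}\dfrac{1}{\sqrt{a}} & y_2 - a/2 \le y \le y_2 + a/2 \\ 0 & \text{elsewhere}\end{cases}$$

Again, we calculate the probability amplitude

$$\langle p_y|\psi\rangle = \frac{1}{\sqrt{2}}\left(\langle p_y|\psi_1\rangle + \langle p_y|\psi_2\rangle\right) \tag{8.557}$$

$$= \frac{1}{2\sqrt{\pi a}}\left[\int_{y_1-a/2}^{y_1+a/2}e^{-ip_yy/\hbar}\,dy + \int_{y_2-a/2}^{y_2+a/2}e^{-ip_yy/\hbar}\,dy\right]$$

$$= \frac{i\hbar}{2p_y\sqrt{\pi a}}\left[e^{-ip_y(y_1+a/2)/\hbar} - e^{-ip_y(y_1-a/2)/\hbar} + e^{-ip_y(y_2+a/2)/\hbar} - e^{-ip_y(y_2-a/2)/\hbar}\right] \tag{8.558}$$

With a slight manipulation of terms this amplitude becomes

$$\langle p_y|\psi\rangle = \frac{\hbar}{p_y\sqrt{\pi a}}\left[e^{-ip_yy_1/\hbar} + e^{-ip_yy_2/\hbar}\right]\sin\left(ap_y/2\hbar\right) \tag{8.559}$$

and the corresponding probability distribution is

$$P(p_y) = \frac{2\hbar^2}{\pi ap_y^2}\left(1 + \cos\left(p_yd/\hbar\right)\right)\sin^2\left(ap_y/2\hbar\right) \tag{8.560}$$

that is, the single-slit distribution (8.553) multiplied by the interference factor
$1 + \cos(p_yd/\hbar)$, where $d = y_1 - y_2$.

The angular distribution $P(\theta)$ is shown in Figure 8.23 and Figure 8.24 below
for two different slit configurations.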


Figure 8.23: Angular Distribution of scattered particles from two slits of finite
width. The slit width is $\lambda = a$. (Axes: angular distribution $P(\theta)$ of scattered particles versus scattering angle $\theta$ in radians.)

Once again, the distance between the slits determines the number of interference
fringes. In this example the distance between the slits is $d = 4\lambda$, where $\lambda$ is the
de Broglie wavelength.

Figure 8.24: Angular Distribution of scattered particles from two slits of finite
width. The slit width is $\lambda = a/2$. (Axes: angular distribution $P(\theta)$ of scattered particles versus scattering angle $\theta$ in radians.)

Using $\phi = pd\sin\theta/\hbar$ and $\alpha = ap\sin\theta/2\hbar$, we again get the optical form.

$$P(\phi) = \frac{a}{\pi}\cos^2(\phi/2)\left(\frac{\sin\alpha}{\alpha}\right)^2 \tag{8.561}$$
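
The factorization into a single-slit envelope times an interference term can be checked by brute-force integration of the amplitude. The following Python sketch is an illustration only (not from the original text): it sets $\hbar = 1$, uses a simple midpoint rule, and the slit width, separation, sample momenta, and function names are assumptions. It compares the numerically integrated two-slit probability with the closed-form single-slit result multiplied by $1 + \cos(p_yd/\hbar)$.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption made for this sketch)

def slit_amplitude(p, center, a, n=2000):
    """(1/sqrt(2 pi)) * integral over one slit of exp(-i p y / hbar) / sqrt(a),
    evaluated with the midpoint rule on n subintervals."""
    dy = a / n
    total = 0j
    for k in range(n):
        y = center - a / 2.0 + (k + 0.5) * dy
        total += cmath.exp(-1j * p * y / HBAR)
    return total * dy / math.sqrt(2.0 * math.pi * a)

def p_single_closed(p, a):
    """Closed-form single-slit probability, 2 hbar^2 sin^2(a p / 2 hbar) / (pi a p^2)."""
    return 2.0 * HBAR ** 2 / (math.pi * a * p ** 2) * math.sin(a * p / (2.0 * HBAR)) ** 2

def p_double_numeric(p, y1, y2, a):
    """|<p|psi>|^2 for the normalized superposition of the two slit states."""
    amp = (slit_amplitude(p, y1, a) + slit_amplitude(p, y2, a)) / math.sqrt(2.0)
    return abs(amp) ** 2

a, d = 1.0, 4.0                 # hypothetical slit width and separation
y1, y2 = d / 2.0, -d / 2.0
results = []
for p in (0.3, 1.1, 2.7):       # arbitrary sample momenta
    lhs = p_double_numeric(p, y1, y2, a)
    rhs = p_single_closed(p, a) * (1.0 + math.cos(p * d / HBAR))
    results.append((p, lhs, rhs))
    print(p, lhs, rhs)
```

Agreement between the two columns for each momentum shows that the finite slit width only modulates the two-slit fringes with the single-slit envelope, as in classical wave optics.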

8.13.7. Conclusions 

The Born postulate has been used to obtain the interference pattern for particles
scattered from a system of slits without referring, a priori, to classical wave
theory. Having identified the state vector as a position state and the measured
observable as the momentum, we obtain explicit expressions for the state vector
$|\psi\rangle$ and its corresponding probability function

$$P(p_y) = |\langle p_y|\psi\rangle|^2$$

The results are in agreement with wave optics.

Quantum interference can occur only when a large number of identically prepared
particles are observed. These particles are detected at different locations,
one at a time. A single particle is always detected as a localized entity and no
wave properties can be discerned from it.

It is interesting that for particles scattered from a double slit, the probability
amplitude that gives rise to the interference is due to a superposition of delta
functions.


8.14. Algebraic Methods - Supersymmetric Quan¬ 
tum Mechanics 

8.14.1. Generalized Ladder Operators 

Earlier in this chapter, we used the ladder or raising/lowering operators and 
their commutator algebra to carry out an algebraic solution for the harmonic 
oscillator system. 

Can we generalize this procedure to other Hamiltonians? 

Let us consider the general Hamiltonian $H_0$

$$H_0 = -\frac{1}{2}\frac{d^2}{dx^2} + V_0(x) \tag{8.562}$$

where we have set $\hbar = 1$ (and, implicitly, $m = 1$).


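For algebraic simplicity, we choose the potential energy term $V_0(x)$ to be

$$V_0(x) = V(x) - E_0 \tag{8.563}$$

where $V(x)$ = actual potential energy and $E_0$ = ground-state energy, so that the
ground state energy is zero (this is just a choice of reference level for the potential
energy). Suppose that the ground state (zero energy now) wave function is
represented by $\psi_0(x)$. We then have

$$H_0\psi_0(x) = -\frac{1}{2}\frac{d^2\psi_0(x)}{dx^2} + V_0(x)\psi_0(x) = -\frac{1}{2}\frac{d^2\psi_0(x)}{dx^2} + (V(x) - E_0)\psi_0(x) = -\frac{1}{2}\frac{d^2\psi_0(x)}{dx^2} + V(x)\psi_0(x) - E_0\psi_0(x) = 0 \tag{8.564}$$

which gives

$$H_0\psi_0(x) = -\frac{1}{2}\psi_0'' + V_0\psi_0 = 0 \tag{8.565}$$

or

$$V_0 = \frac{1}{2}\frac{\psi_0''}{\psi_0} \tag{8.566}$$

and

$$H_0 = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}\frac{\psi_0''}{\psi_0} \tag{8.567}$$

Let us now introduce the new operators (a generalization of the raising and
lowering operators)

$$a^\pm = \mp\frac{1}{\sqrt{2}}\frac{d}{dx} + W(x) \tag{8.568}$$

where

$$W(x) = -\frac{1}{\sqrt{2}}\frac{\psi_0'}{\psi_0} \tag{8.569}$$

is called the superpotential for the problem.

We then have

$$2a^\pm a^\mp = 2\left[\frac{1}{\sqrt{2}}\left(\mp\frac{d}{dx} - \frac{\psi_0'}{\psi_0}\right)\right]\left[\frac{1}{\sqrt{2}}\left(\pm\frac{d}{dx} - \frac{\psi_0'}{\psi_0}\right)\right] = -\frac{d^2}{dx^2} \pm \alpha + \left(\frac{\psi_0'}{\psi_0}\right)^2 \tag{8.570}$$

To determine a more useful form for the quantity

$$\alpha = \left[\frac{d}{dx}, \frac{\psi_0'}{\psi_0}\right] \tag{8.571}$$

we let it operate on an arbitrary function $f(x)$. We get

$$\alpha f(x) = \frac{d}{dx}\left(\frac{\psi_0'}{\psi_0}f\right) - \frac{\psi_0'}{\psi_0}\frac{df}{dx} = \left[\frac{d}{dx}\left(\frac{\psi_0'}{\psi_0}\right)\right]f(x) \tag{8.572}$$

so that

$$\alpha = \frac{d}{dx}\left(\frac{\psi_0'}{\psi_0}\right) = \frac{\psi_0''}{\psi_0} - \left(\frac{\psi_0'}{\psi_0}\right)^2 \tag{8.573}$$

We then get

$$2a^\pm a^\mp = -\frac{d^2}{dx^2} \pm \frac{\psi_0''}{\psi_0} + (1 \mp 1)\left(\frac{\psi_0'}{\psi_0}\right)^2 \tag{8.574}$$

If we define two new quantities by

$$V_1(x) = V_0(x) - \frac{d}{dx}\frac{\psi_0'}{\psi_0} \quad , \quad H_1 = -\frac{1}{2}\frac{d^2}{dx^2} + V_1(x) \tag{8.575}$$

then we have

$$a^+a^- = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}\frac{\psi_0''}{\psi_0} = H_0 \tag{8.576}$$

and

$$a^-a^+ = -\frac{1}{2}\frac{d^2}{dx^2} - \frac{1}{2}\frac{\psi_0''}{\psi_0} + \left(\frac{\psi_0'}{\psi_0}\right)^2 = -\frac{1}{2}\frac{d^2}{dx^2} + V_0 - \frac{\psi_0''}{\psi_0} + \left(\frac{\psi_0'}{\psi_0}\right)^2 = -\frac{1}{2}\frac{d^2}{dx^2} + V_0 - \frac{d}{dx}\frac{\psi_0'}{\psi_0} = -\frac{1}{2}\frac{d^2}{dx^2} + V_1 = H_1 \tag{8.577}$$

$H_1$ is called the supersymmetric (SUSY) partner of $H_0$. $V_1(x)$ and $V_0(x)$ are
called the supersymmetric partner potentials.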

Now consider the operator

$$\beta = \frac{d}{dx} \tag{8.578}$$

(remember $-i\beta = p$ = linear momentum operator). What is $\beta^+$? We could figure
this out from the momentum operator, but it is instructive to do it from scratch.
The formal definition of $\beta^+$ is given by the hermiticity condition

$$\int (\beta^+ g^*(x)) f(x)\,dx = \int g^*(x)(\beta f(x))\,dx \tag{8.579}$$

or

$$\int (\beta^+ g^*) f\,dx = \int g^*\frac{df}{dx}\,dx = g^* f\Big|_{\text{limit points}} - \int \frac{dg^*}{dx} f\,dx = \int ((-\beta) g^*) f\,dx \tag{8.580}$$

This says that $\beta^+ = -\beta$, which we can also see from the Hermitian momentum
operator: since $\beta = ip$, we have $\beta^+ = (ip)^+ = -ip^+ = -ip = -\beta$. We have assumed
that $g, f \to 0$ at the limit points (this is required for hermiticity).


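Therefore, since

$$a^\pm = \frac{1}{\sqrt{2}}\left[\mp\frac{d}{dx} - \frac{\psi_0'}{\psi_0}\right] = \frac{1}{\sqrt{2}}\left[\mp\beta - \frac{\psi_0'}{\psi_0}\right] \tag{8.581}$$

we get

$$(a^+)^+ = \frac{1}{\sqrt{2}}\left[-\beta^+ - \frac{\psi_0'}{\psi_0}\right] = \frac{1}{\sqrt{2}}\left[\frac{d}{dx} - \frac{\psi_0'}{\psi_0}\right] = \frac{1}{\sqrt{2}}\left[\beta - \frac{\psi_0'}{\psi_0}\right] = a^- \tag{8.582}$$

Similarly we have $(a^-)^+ = a^+$, which says that

$$H_0 = a^+a^- = a^+(a^+)^+ = (a^-)^+a^- \tag{8.583}$$

which is the same result we had in the harmonic oscillator case, i.e., the Hamiltonian
is expressed as the square of an operator.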


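The next thing we need is the commutator

$$[a^-, a^+] = a^-a^+ - a^+a^- = H_1 - H_0 \tag{8.584}$$

We find

$$[a^-, a^+] = \left(-\frac{1}{2}\frac{d^2}{dx^2} + V_0 - \frac{d}{dx}\frac{\psi_0'}{\psi_0}\right) - \left(-\frac{1}{2}\frac{d^2}{dx^2} + V_0\right) = -\frac{d}{dx}\frac{\psi_0'}{\psi_0} \tag{8.585}$$

In general, this commutator is a function of $x$ for non-harmonic potentials (it is
equal to a constant for the harmonic oscillator).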


In order to work with this system of operators we need to figure out some of 
their properties. 

(1) We have

$$a^+a^-\psi_0 = H_0\psi_0 = 0$$

$$\langle\psi_0|H_0|\psi_0\rangle = 0 = \langle\psi_0|a^+a^-|\psi_0\rangle$$

This says that the norm $\| a^-|\psi_0\rangle \|$ equals zero and hence that

$$a^-\psi_0 = 0 \tag{8.586}$$

(2) We have

$$a^+H_1 - H_0a^+ = a^+a^-a^+ - a^+a^-a^+ = 0 = a^-H_0 - H_1a^- \tag{8.587}$$

(3) Let the state $\psi_n^0$ be an eigenstate of $H_0$ with eigenvalue $E_n^0$. We then have

$$H_0\psi_n^0 = a^+a^-\psi_n^0 = E_n^0\psi_n^0 \tag{8.588}$$

$$a^-a^+(a^-\psi_n^0) = E_n^0(a^-\psi_n^0) = H_1(a^-\psi_n^0)$$

which says that $a^-\psi_n^0$ is an eigenstate of $H_1$ with eigenvalue $E_n^0$ (which is
also an eigenvalue of $H_0$). This is true except for the ground state, where
$a^-\psi_0 = 0$.

(4) Let the state $\psi_n^1$ be an eigenstate of $H_1$ with eigenvalue $E_n^1$. We then have

$$H_1\psi_n^1 = a^-a^+\psi_n^1 = E_n^1\psi_n^1$$

$$a^+a^-(a^+\psi_n^1) = E_n^1(a^+\psi_n^1) = H_0(a^+\psi_n^1)$$

which says that $a^+\psi_n^1$ is an eigenstate of $H_0$ with eigenvalue $E_n^1$ (which
is also an eigenvalue of $H_1$).

This means that the eigenvalue spectrum of the two Hamiltonians $H_0$ and
$H_1$ can be derived from each other as shown in Figure 8.25 below.

Figure 8.25: Eigenstate/Eigenvalue Relationships

$H_0$ and $H_1$ have the same energy level spectrum $E_n^0, E_n^1$ except that the
zero energy ground state of $H_0$, $E_0^0$, has no counterpart for $H_1$.

The arrows indicate the operator connections between the state vectors.

This is an extremely important result.

If only one of the Hamiltonians is exactly solvable or more easily treated
using approximation methods, then the solutions for the other Hamiltonian
can be obtained from the solutions of the solvable member of the
pair. This is the meaning of supersymmetric partners.


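(5) Now consider the normalization condition. We have

$$\langle\psi_n^1|H_1|\psi_n^1\rangle = \langle\psi_n^1|a^-a^+|\psi_n^1\rangle = \langle\psi_n^1|E_n^1|\psi_n^1\rangle = E_n^1\langle\psi_n^1|\psi_n^1\rangle \tag{8.589}$$

Therefore, if $\langle\psi_n^1|\psi_n^1\rangle = 1$ (normalized to 1), then, if we let $|\alpha\rangle = a^+|\psi_n^1\rangle$,
we must have

$$\langle\alpha|\alpha\rangle = E_n^1 \tag{8.590}$$

The state $|\alpha\rangle = a^+|\psi_n^1\rangle$ is not normalized. We can ensure normalized states
in successive steps by choosing

$$|\psi_n^0\rangle = \frac{1}{\sqrt{E_n^1}}\,a^+|\psi_n^1\rangle \quad \text{and} \quad |\psi_n^1\rangle = \frac{1}{\sqrt{E_n^0}}\,a^-|\psi_n^0\rangle \tag{8.591}$$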


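We now fix one part of the notation by letting $\psi_0^0 = \psi_0$ and define

$$\Phi = -\frac{\psi_0'}{\psi_0} \tag{8.592}$$

which implies that

$$a^\pm = \frac{1}{\sqrt{2}}\left[\mp\frac{d}{dx} + \Phi(x)\right] \tag{8.593}$$

$$V_0 = \frac{1}{2}\left[-\Phi' + \Phi^2\right] \tag{8.594}$$

$$V_1 = \frac{1}{2}\left[\Phi' + \Phi^2\right] \tag{8.595}$$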


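A compact matrix notation for both Hamiltonians is

$$\begin{pmatrix} H_1 & 0 \\ 0 & H_0 \end{pmatrix} = \left(\frac{1}{2}p^2 + \frac{1}{2}\Phi^2\right)I + \frac{1}{2}\Phi'\sigma_z \tag{8.596}$$

where

$$p = -i\frac{d}{dx} \quad \text{and} \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{8.597}$$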

Now the equation

$$\Phi = -\frac{\psi_0'}{\psi_0} \tag{8.598}$$

says that

$$-\Phi(x)\,dx = \frac{d\psi_0}{\psi_0} \tag{8.599}$$

$$-\int\Phi(x)\,dx = \ln(\psi_0) - \ln(A) \tag{8.600}$$

$$\psi_0(x) = A\exp\left[-\int\Phi(x)\,dx\right] \tag{8.601}$$

where $A$ is a constant given by the normalization of $\psi_0$.

This result can be inverted, i.e., we can specify $\Phi(x)$, which then gives $\psi_0$ and a
pair of Hamiltonians to study.
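
The inversion just described can be sketched numerically. The short Python example below is only an illustration under stated assumptions: the function names (`partner_potentials`, `ground_state`, `h0_residual`) and the sample superpotential $\Phi(x) = x^3$ are hypothetical choices, not taken from the text, and $\hbar = m = 1$ as above. It builds $V_0$ and $V_1$ from $\Phi$, constructs the unnormalized $\psi_0 = \exp[-\int\Phi\,dx]$, and checks that $H_0\psi_0 \approx 0$.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def partner_potentials(phi, x):
    """(V0, V1) = ((phi^2 - phi')/2, (phi^2 + phi')/2) at x, with hbar = m = 1."""
    dphi = derivative(phi, x)
    return 0.5 * (phi(x) ** 2 - dphi), 0.5 * (phi(x) ** 2 + dphi)

def ground_state(phi, x, steps=200):
    """Unnormalized psi0(x) = exp(-integral of phi from 0 to x), Simpson's rule."""
    dx = x / steps                      # steps must be even
    total = phi(0.0) + phi(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * phi(i * dx)
    return math.exp(-total * dx / 3.0)

def h0_residual(phi, x, h=1e-3):
    """(H0 psi0)(x) = -psi0''(x)/2 + V0(x) psi0(x); should be close to zero."""
    psi = lambda y: ground_state(phi, y)
    d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h ** 2
    v0, _ = partner_potentials(phi, x)
    return -0.5 * d2 + v0 * psi(x)

def phi(x):
    return x ** 3       # illustrative superpotential, not taken from the text

for x in (-1.0, 0.3, 1.2):
    v0, v1 = partner_potentials(phi, x)
    print(x, v0, v1, h0_residual(phi, x))
```

The residual in the last column is zero only up to finite-difference error; any smooth $\Phi$ for which $\exp[-\int\Phi\,dx]$ is normalizable can be substituted.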


8.14.2. Examples 

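(1) Harmonic Oscillator - We choose $\Phi(x) = \omega x$. This gives the two potentials

$$V_0 = \frac{1}{2}\left[-\Phi' + \Phi^2\right] = \frac{1}{2}\left[-\omega + \omega^2x^2\right] \tag{8.602}$$

$$V_1 = \frac{1}{2}\left[\Phi' + \Phi^2\right] = \frac{1}{2}\left[\omega + \omega^2x^2\right] \tag{8.603}$$

and the two Hamiltonians

$$H_0 = -\frac{1}{2}\frac{d^2}{dx^2} + V_0 = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}\omega^2x^2 - \frac{1}{2}\omega \tag{8.604}$$

$$H_1 = -\frac{1}{2}\frac{d^2}{dx^2} + V_1 = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}\omega^2x^2 + \frac{1}{2}\omega \tag{8.605}$$

which correspond to two harmonic oscillators with zero point energies
differing by $\omega$. We then find

$$a^- = \frac{1}{\sqrt{2}}\left[+\frac{d}{dx} + \Phi(x)\right] = \frac{1}{\sqrt{2}}\left[\frac{d}{dx} + \omega x\right] \tag{8.606}$$

Now we must have

$$\sqrt{2}\,a^-\psi_0 = \left[\frac{d}{dx} + \omega x\right]\psi_0 = 0 \tag{8.607}$$

which is easy to solve. We get

$$\frac{d\psi_0(x)}{\psi_0(x)} = -\omega x\,dx \quad\rightarrow\quad \int\frac{d\psi_0(x)}{\psi_0(x)} = -\omega\int x\,dx \tag{8.608}$$

$$\ln\psi_0(x) - \ln A = -\frac{1}{2}\omega x^2 \tag{8.609}$$

$$\psi_0(x) = A\exp\left(-\frac{1}{2}\omega x^2\right) \tag{8.610}$$

as the ground state of $H_0$. Since $H_1$ differs from $H_0$ only by a shift of $\omega$,
its lowest eigenstate corresponds to

$$\psi_1^1(x) = A\exp\left(-\frac{1}{2}\omega x^2\right) \quad , \quad E_1^1 = \omega \tag{8.611}$$

The first excited state of $H_0$ is obtained using $a^+$. We have $a^+\psi_1^1 = a^+\psi_0 = \psi_1^0$,
so that

$$\psi_1^0 = \frac{1}{\sqrt{2}}\left[-\frac{d}{dx} + \omega x\right]\psi_1^1 = \frac{1}{\sqrt{2}}\left[-\frac{d}{dx} + \omega x\right]A\exp\left(-\frac{1}{2}\omega x^2\right) = Ax\exp\left(-\frac{1}{2}\omega x^2\right) \tag{8.612}$$

(the constant factor $\sqrt{2}\,\omega$ has been absorbed into $A$) with $E_1^0 = \omega$, and so on.
Iterating in this fashion we can generate the entire solution. It clearly agrees
with our earlier results for the harmonic oscillator.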
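
The pairing of levels in this example can also be checked numerically. The following Python sketch is an illustration rather than part of the derivation: it discretizes $-\frac{1}{2}\frac{d^2}{dx^2} + V$ on a grid (a finite-difference tridiagonal matrix, which is an assumption of the sketch, as are the grid size and box width) and finds the lowest eigenvalues by Sturm-sequence bisection. The helper names `count_below` and `lowest_eigenvalues` are hypothetical.

```python
def count_below(diag, off, lam):
    """Number of eigenvalues below lam for a symmetric tridiagonal matrix
    with diagonal `diag` and constant off-diagonal `off` (Sturm sequence)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - lam - (off * off / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300              # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def lowest_eigenvalues(potential, k, half_width=8.0, n=1600):
    """Lowest k eigenvalues of -1/2 d^2/dx^2 + V on [-half_width, half_width]
    (hbar = m = 1), using central differences and bisection."""
    h = 2.0 * half_width / (n + 1)
    xs = [-half_width + (i + 1) * h for i in range(n)]
    diag = [1.0 / h ** 2 + potential(x) for x in xs]
    off = -0.5 / h ** 2
    lo_bound = min(potential(x) for x in xs) - 1.0   # below every eigenvalue
    hi_bound = max(diag) + 2.0 * abs(off)            # Gershgorin upper bound
    levels = []
    for j in range(k):
        lo, hi = lo_bound, hi_bound
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > j:
                hi = mid
            else:
                lo = mid
        levels.append(0.5 * (lo + hi))
    return levels

omega = 1.0
v0 = lambda x: 0.5 * (omega ** 2 * x ** 2 - omega)   # V0 from (8.602)
v1 = lambda x: 0.5 * (omega ** 2 * x ** 2 + omega)   # V1 from (8.603)
e0 = lowest_eigenvalues(v0, 4)   # close to 0, omega, 2*omega, 3*omega
e1 = lowest_eigenvalues(v1, 3)   # close to omega, 2*omega, 3*omega
print(e0)
print(e1)
```

Up to discretization error, every level of $H_1$ matches a level of $H_0$, and only the zero-energy ground state of $H_0$ is left without a partner.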


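(2) Reflection-Free Potentials - Let us investigate the case given by

$$\Phi(x) = \tanh(x) \tag{8.613}$$

We then have

$$V_0 = \frac{1}{2}\left[1 - \frac{2}{\cosh^2 x}\right] \quad , \quad V_1 = \frac{1}{2} \tag{8.614}$$

Thus, the potential

$$-\frac{1}{\cosh^2 x} \tag{8.615}$$

has the constant potential 1/2, which corresponds to a free particle, as its
supersymmetric partner.

We can find the normalized ground state eigenfunction of $H_0$ using

$$\psi_0 = A\exp\left[-\int\Phi(x)\,dx\right] = A\exp\left[-\log(\cosh(x))\right] = \frac{A}{\cosh(x)} \tag{8.616}$$

Now, normalization gives

$$\int_{-\infty}^{\infty}\psi_0^2\,dx = 1 \rightarrow A = \frac{1}{\sqrt{2}} \rightarrow \psi_0 = \frac{1}{\sqrt{2}}\frac{1}{\cosh(x)} \tag{8.617}$$

The ground state energy is $E_0^0 = 0$ (by definition). The energy eigenstates
for

$$H_1 = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2} \tag{8.618}$$

are given by

$$H_1\psi_k^1 = E_k^1\psi_k^1 \tag{8.619}$$

$$-\frac{1}{2}\frac{d^2\psi_k^1}{dx^2} + \frac{1}{2}\psi_k^1 = E_k^1\psi_k^1 \tag{8.620}$$

which has solutions

$$\psi_k^1(x) = e^{ikx} \tag{8.621}$$

where

$$k^2 = 2\left(E_k^1 - \frac{1}{2}\right) \quad \text{or} \quad E_k^1 = \frac{1}{2}(1 + k^2) \tag{8.622}$$

From the formalism, the remaining normalized eigenfunctions of $H_0$ are
given by

$$\psi_k^0(x) = \frac{1}{\sqrt{E_k^1}}\,a^+\psi_k^1 = \frac{1}{\sqrt{\frac{1}{2}(1 + k^2)}}\frac{1}{\sqrt{2}}\left[-\frac{d}{dx} + \tanh(x)\right]e^{ikx} = \frac{(-ik + \tanh(x))}{\sqrt{1 + k^2}}\,e^{ikx} \tag{8.623}$$

The corresponding energy eigenvalues are also $E_k^1$. These continuum states
have the property that they do not possess reflected waves, hence the name
reflection-free potentials.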
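
As a quick check of this example, the Python sketch below (an illustration only; the function names, the sample value $k = 1.3$, and the finite-difference step are assumptions) verifies numerically that $(-ik + \tanh x)e^{ikx}/\sqrt{1 + k^2}$ satisfies $H_0\psi = \frac{1}{2}(1 + k^2)\psi$ with $V_0 = \frac{1}{2} - 1/\cosh^2 x$, and that its modulus is the same far to the left and far to the right of the well, as expected when there is no reflected wave.

```python
import cmath
import math

def psi_k(x, k):
    """Continuum eigenfunction (-ik + tanh x) e^{ikx} / sqrt(1 + k^2)."""
    return (-1j * k + math.tanh(x)) * cmath.exp(1j * k * x) / math.sqrt(1.0 + k * k)

def h0_psi(x, k, h=1e-4):
    """Apply H0 = -1/2 d^2/dx^2 + 1/2 - 1/cosh^2(x) using central differences."""
    d2 = (psi_k(x + h, k) - 2.0 * psi_k(x, k) + psi_k(x - h, k)) / h ** 2
    v0 = 0.5 - 1.0 / math.cosh(x) ** 2
    return -0.5 * d2 + v0 * psi_k(x, k)

k = 1.3                                  # sample wave number (arbitrary choice)
energy = 0.5 * (1.0 + k * k)
for x in (-2.0, 0.4, 3.0):
    # residual of the eigenvalue equation; small up to discretization error
    print(x, abs(h0_psi(x, k) - energy * psi_k(x, k)))

# Far from the well only e^{ikx} survives on both sides, with equal modulus.
print(abs(psi_k(-20.0, k)), abs(psi_k(20.0, k)))
```

The potential only changes the phase of the transmitted wave: the prefactor goes from $(-ik - 1)$ on the far left to $(-ik + 1)$ on the far right.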


8.14.3. Generalizations 

While the SUSY method seems to be very powerful, one wonders - how does
one find the $\Phi(x)$ functions relevant to a particular Hamiltonian $H$ that one
wants to solve?

We can see how to do this by re-deriving this result in a different way.

We consider a general Hamiltonian $H$ and define a set of operators $\eta_1, \eta_2, \eta_3, \ldots$
and a set of real constants $E_1, E_2, E_3, \ldots$ such that they satisfy the recursion
relations

$$\eta_1^+\eta_1 + E_1 = H$$
$$\eta_2^+\eta_2 + E_2 = \eta_1\eta_1^+ + E_1$$
$$\eta_3^+\eta_3 + E_3 = \eta_2\eta_2^+ + E_2$$

or, in general

$$\eta_{j+1}^+\eta_{j+1} + E_{j+1} = \eta_j\eta_j^+ + E_j \quad , \quad j = 1, 2, 3, \ldots \tag{8.624}$$

Theorem - If each $\eta_j$ has an eigenvector $|\xi_j\rangle$ with eigenvalue equal to zero such
that

$$\eta_j|\xi_j\rangle = 0 \tag{8.625}$$

then

(a) the constant $E_j$ is the $j$th eigenvalue of $H$ (arranged in ascending order)

$$E_1 = \text{ground state energy} \; , \; E_2 = \text{1st excited state energy} \; , \ldots \tag{8.626}$$

(b) the corresponding eigenvector is (ignoring normalization)

$$|E_j\rangle = \eta_1^+\eta_2^+\eta_3^+\cdots\eta_{j-1}^+|\xi_j\rangle \tag{8.627}$$

Before proving this theorem, let us talk about the meaning of the theorem. 

Statement (a) implies not only that $E_j$ is an eigenvalue, but also that there is
no eigenvalue between $E_{j-1}$ and $E_j$, i.e., if $E$ is an eigenvalue of $H$, then $E$
equals one of the $E_j$ or else $E$ is larger than all of the $E_j$.

If we introduce $E_{max}$ = least upper bound of the sequence $E_1, E_2, E_3, \ldots$, then
the theorem gives all the eigenvalues below $E_{max}$. If $E_{max} = \infty$ (as in the
harmonic oscillator), then this theorem implies all of the eigenvalues. If $E_{max} = 0$
(as in the hydrogen atom), then this theorem implies all negative eigenvalues.

The theorem yields all parts of the discrete spectrum and says nothing about
the continuous part of the spectrum (if it exists).

Proof of the Theorem 

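We first define new operators

$$A_j = \eta_j^+\eta_j + E_j \tag{8.628}$$

so that using the recursion relations we get

$$A_{j+1} = \eta_{j+1}^+\eta_{j+1} + E_{j+1} = \eta_j\eta_j^+ + E_j \quad \text{for all } j \tag{8.629}$$

We then have

$$A_{j+1}\eta_j = (\eta_j\eta_j^+ + E_j)\eta_j = \eta_j(\eta_j^+\eta_j + E_j) = \eta_jA_j$$
$$A_j\eta_j^+ = (\eta_j^+\eta_j + E_j)\eta_j^+ = \eta_j^+(\eta_j\eta_j^+ + E_j) = \eta_j^+A_{j+1}$$
$$H = A_1 = \eta_1^+\eta_1 + E_1$$

Therefore

$$H|E_j\rangle = A_1\eta_1^+\eta_2^+\eta_3^+\cdots\eta_{j-1}^+|\xi_j\rangle = \eta_1^+A_2\eta_2^+\eta_3^+\cdots\eta_{j-1}^+|\xi_j\rangle = \eta_1^+\eta_2^+A_3\eta_3^+\cdots\eta_{j-1}^+|\xi_j\rangle = \cdots$$

and so on until

$$H|E_j\rangle = \eta_1^+\eta_2^+\eta_3^+\cdots\eta_{j-1}^+A_j|\xi_j\rangle$$

But our assumption $\eta_j|\xi_j\rangle = 0$ then gives

$$A_j|\xi_j\rangle = (\eta_j^+\eta_j + E_j)|\xi_j\rangle = \eta_j^+(\eta_j|\xi_j\rangle) + E_j|\xi_j\rangle = E_j|\xi_j\rangle \tag{8.630}$$

or $|\xi_j\rangle$ is an eigenstate of $A_j$ with eigenvalue $E_j$. This gives

$$H|E_j\rangle = E_j|E_j\rangle \tag{8.631}$$

or $|E_j\rangle$ is an eigenstate of $H$ with eigenvalue $E_j$.

Now consider the quantity $E_{j+1} - E_j$ and assume that $\langle\xi_{j+1}|\xi_{j+1}\rangle = 1$. We then
have

$$E_{j+1} - E_j = \langle\xi_{j+1}|(E_{j+1} - E_j)|\xi_{j+1}\rangle = \langle\xi_{j+1}|(\eta_j\eta_j^+ - \eta_{j+1}^+\eta_{j+1})|\xi_{j+1}\rangle = \langle\xi_{j+1}|\eta_j\eta_j^+|\xi_{j+1}\rangle \quad \text{since } \eta_{j+1}|\xi_{j+1}\rangle = 0 \tag{8.632}$$

Now let $|\beta\rangle = \eta_j^+|\xi_{j+1}\rangle$. We then have $E_{j+1} - E_j = \langle\beta|\beta\rangle \geq 0$, which says that

$$E_1 \leq E_2 \leq E_3 \leq E_4 \leq \cdots \tag{8.633}$$

Finally, consider the (normalized) eigenvector $|E\rangle$ of $H$ and define the vector

$$|\rho_n\rangle = \eta_n\eta_{n-1}\eta_{n-2}\eta_{n-3}\cdots\eta_2\eta_1|E\rangle \tag{8.634}$$

We then have

$$0 \leq \langle\rho_1|\rho_1\rangle = \langle E|\eta_1^+\eta_1|E\rangle = \langle E|(A_1 - E_1)|E\rangle = \langle E|(H - E_1)|E\rangle = \langle E|(E - E_1)|E\rangle \tag{8.635}$$

or

$$E - E_1 \geq 0 \rightarrow E \geq E_1 \tag{8.636}$$

Similarly,

$$0 \leq \langle\rho_2|\rho_2\rangle = \langle E|\eta_1^+\eta_2^+\eta_2\eta_1|E\rangle = \langle E|\eta_1^+(A_2 - E_2)\eta_1|E\rangle = \langle E|\eta_1^+(A_2\eta_1 - \eta_1E_2)|E\rangle = \langle E|\eta_1^+(\eta_1A_1 - \eta_1E_2)|E\rangle = \langle E|\eta_1^+\eta_1(H - E_2)|E\rangle = (E - E_2)\langle E|\eta_1^+\eta_1|E\rangle = (E - E_2)(E - E_1) \tag{8.637}$$

But, since $E \geq E_1$, this says that $E \geq E_2$ (or $E = E_1$). Generalizing we have

$$0 \leq \langle\rho_n|\rho_n\rangle = (E - E_n)(E - E_{n-1})\cdots(E - E_2)(E - E_1) \tag{8.638}$$

which implies that $E \geq E_j$ for all $j$ or $E$ = one of the $E_j$, which concludes the
proof.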

This theorem does not, however, tell us how to find the operators.

The crucial step in the method is the factorization of the Hamiltonian $H$, i.e.,

$$H = \eta_1^+\eta_1 + E_1 \rightarrow H - E_1 = \eta_1^+\eta_1 = \text{"square" of an operator} \tag{8.639}$$

Once we choose $\eta_1$, it is not difficult to construct the others.

During the construction, any adjustable parameters must be chosen in such a
way that all the $E_j$ are as large as possible, which then guarantees a unique solution
to any problem. To see that this requirement is necessary, consider a change in
$\eta_j$ and $E_j$ with $\eta_{j-1}$ and $E_{j-1}$ held fixed. Now

$$\eta_j^+\eta_j + E_j = \eta_{j-1}\eta_{j-1}^+ + E_{j-1} \tag{8.640}$$

implies that

$$\delta(\eta_j^+\eta_j) + \delta E_j = \delta(\eta_{j-1}\eta_{j-1}^+ + E_{j-1}) = 0 \tag{8.641}$$

when $\eta_{j-1}$ and $E_{j-1}$ are held fixed. Therefore, we get

$$\delta E_j = \langle\xi_j|\delta E_j|\xi_j\rangle = \langle\xi_j|\left(-\delta(\eta_j^+\eta_j)\right)|\xi_j\rangle = \langle\xi_j|\left(-(\delta\eta_j^+)\eta_j - \eta_j^+(\delta\eta_j)\right)|\xi_j\rangle$$
$$= -\langle\xi_j|(\delta\eta_j^+)\eta_j|\xi_j\rangle - \langle\xi_j|\eta_j^+(\delta\eta_j)|\xi_j\rangle = 0 - \langle\xi_j|\eta_j^+(\delta\eta_j)|\xi_j\rangle$$
$$= -\langle\xi_j|(\delta\eta_j)^+\eta_j|\xi_j\rangle^* = 0$$

This says that we are at an extremum or physically at a maximum.

In the first derivation we simply guessed the function $\Phi(x)$, calculated $V_0(x)$
and then solved the problem corresponding to

$$H_0 = -\frac{1}{2}\frac{d^2}{dx^2} + V_0(x) \tag{8.642}$$

It turns out, however, that for the particular form of $H$ that we have been
considering, we can do more than guess. In particular, for

$$H = \frac{p^2}{2m} + V(x) \tag{8.643}$$

we can always choose

$$\eta_j = \frac{1}{\sqrt{2m}}\left(p + if_j(x)\right) \tag{8.644}$$

where $f_j(x)$ = a real function of the operator $x$.

8.14.4. Examples

(1) Harmonic Oscillator revisited - The Hamiltonian for this system is

$$H = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2 \tag{8.645}$$

We assume

$$\eta_j = \frac{1}{\sqrt{2m}}\left(p + if_j(x)\right) \tag{8.646}$$

We then have, from earlier discussions,

$$[x, p] = i\hbar \quad , \quad [f(x), p] = i\hbar\frac{df}{dx} \tag{8.647}$$

The last commutator follows by operating on an arbitrary function $g(x)$

$$[f(x), p]g(x) = [f(x)p - pf(x)]g(x) = -i\hbar\left[f(x)\frac{d}{dx} - \frac{d}{dx}f(x)\right]g(x) = -i\hbar\left[f\frac{dg}{dx} - \frac{d(fg)}{dx}\right] = -i\hbar\left[f\frac{dg}{dx} - f\frac{dg}{dx} - g\frac{df}{dx}\right] = i\hbar\frac{df}{dx}g(x)$$

which gives the commutation relation.

We then get (using $p^+ = p$, $x^+ = x$ and $f^+(x) = f^*(x) = f(x)$)

$$\eta_j^+\eta_j = \frac{1}{2m}(p - if_j)(p + if_j) = \frac{1}{2m}p^2 + \frac{1}{2m}f_j^2 - \frac{i}{2m}[f_j, p] = \frac{1}{2m}p^2 + \frac{1}{2m}f_j^2 + \frac{\hbar}{2m}\frac{df_j}{dx}$$

Similarly,

$$\eta_j\eta_j^+ = \frac{1}{2m}p^2 + \frac{1}{2m}f_j^2 - \frac{\hbar}{2m}\frac{df_j}{dx} \tag{8.648}$$

We then have

$$H = \eta_1^+\eta_1 + E_1$$

which implies that

$$\frac{1}{2m}p^2 + \frac{1}{2m}f_1^2 + \frac{\hbar}{2m}\frac{df_1}{dx} + E_1 = \frac{1}{2m}p^2 + \frac{1}{2}m\omega^2x^2 \tag{8.649}$$

$$\frac{1}{2m}f_1^2 + \frac{\hbar}{2m}\frac{df_1}{dx} + E_1 = \frac{1}{2}m\omega^2x^2 \tag{8.650}$$

$$f_1(x) = \pm m\omega x \quad , \quad E_1 = \mp\frac{1}{2}\hbar\omega \tag{8.651}$$

We need only find a particular solution that guarantees the existence of a vector
$|\xi_j\rangle$ such that $\eta_j|\xi_j\rangle = 0$.

Which sign does this imply?

The choice of the minus sign guarantees an ascending order of eigenvalues. The
choice of the plus sign implies a descending order and no lower limit to the set
of eigenvalues, which makes no physical sense.

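So we choose

$$f_1(x) = -m\omega x \quad , \quad E_1 = +\frac{1}{2}\hbar\omega \tag{8.652}$$

We now find $f_2$ using

$$\eta_2^+\eta_2 + E_2 = \eta_1\eta_1^+ + E_1 \tag{8.653}$$

$$\frac{1}{2m}f_2^2 + \frac{\hbar}{2m}\frac{df_2}{dx} + E_2 = \frac{1}{2}m\omega^2x^2 + \hbar\omega \tag{8.654}$$

which gives

$$f_2(x) = -m\omega x \quad , \quad E_2 = +\frac{3}{2}\hbar\omega \tag{8.655}$$

Similarly, the recursion structure gives

$$f_j(x) = -m\omega x \quad , \quad E_j = +(j - 1/2)\hbar\omega \tag{8.656}$$

Thus, all the $\eta_j$ are identical

$$\eta_j = \frac{1}{\sqrt{2m}}(p - im\omega x) \tag{8.657}$$

and

$$|E_j\rangle \propto (\eta_j^+)^{j-1}|\xi_j\rangle \propto (p + im\omega x)^{j-1}|\xi_j\rangle \tag{8.658}$$

where

$$\eta_j|\xi_j\rangle = \frac{1}{\sqrt{2m}}(p - im\omega x)|\xi_j\rangle = 0 \tag{8.659}$$

Finally, we show that the state $|\xi_j\rangle$ exists. We have

$$\eta_1|\xi_1\rangle = 0 \rightarrow |\xi_1\rangle = \text{the ground state} \tag{8.660}$$

This says that

$$(p - im\omega x)|\xi_1\rangle = 0 \tag{8.661}$$

$$\langle x|(p - im\omega x)|\xi_1\rangle = \langle x|p|\xi_1\rangle - im\omega\langle x|x|\xi_1\rangle = 0 \tag{8.662}$$

$$-i\hbar\frac{d}{dx}\langle x|\xi_1\rangle - im\omega x\langle x|\xi_1\rangle = 0 \tag{8.663}$$

This last differential equation says that

$$\langle x|\xi_1\rangle = A\exp\left[-\frac{m\omega x^2}{2\hbar}\right] \tag{8.664}$$

which is the same ground state wave function as before!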

Does this procedure really work for all Hamiltonians H of this form? 

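(2) One-Dimensional Infinite Square Well - This is probably the most
difficult case to consider because the potential involved is not defined by an explicit,
analytic function of $x$. It only enters the problem implicitly via boundary
conditions at the walls. This means that $H$ has no explicit dependence on $x$.
We must, however, somehow introduce a dependence on $x$ in the factorization
procedure.

Let the well extend from $x = 0$ to $x = L$. We then have (inside the well)

$$H = \frac{p^2}{2m} \tag{8.665}$$

With $\eta_1 = \frac{1}{\sqrt{2m}}(p + if_1(x))$ as before, the factorization $H = \eta_1^+\eta_1 + E_1$ requires

$$\frac{1}{2m}f_1^2 + \frac{\hbar}{2m}\frac{df_1}{dx} + E_1 = 0 \tag{8.666}$$

which has the solution

$$f_1(x) = -\sqrt{2mE_1}\,\tan\left[\frac{\sqrt{2mE_1}}{\hbar}(x - x_0)\right] \tag{8.667}$$

where $x_0$ is a constant of integration.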

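The possible choices of $E_1$ and $x_0$ are restricted by the behavior of the tangent
function. In the position representation, $x$ is a number such that $0 \leq x \leq L$. The
tangent must remain finite in this interval. The singularities of $\tan y$ occur at

$$y = \frac{\pi}{2}, \frac{3\pi}{2}, \frac{5\pi}{2}, \ldots \tag{8.668}$$

i.e., they are separated by $\pi$ radians. We attain the largest possible value of $E_1$
if one of the singularities lies at $x = 0$ and the next at $x = L$ (it is permissible for
the tangent to have a singularity at these points (the walls) since the potential
and hence $H$ are also infinite there). So we assume that

$$\text{at } x = 0: \quad \tan\left[-\frac{\sqrt{2mE_1}}{\hbar}x_0\right] = \infty \tag{8.669}$$

$$\text{at } x = L: \quad \tan\left[\frac{\sqrt{2mE_1}}{\hbar}(L - x_0)\right] = \infty \tag{8.670}$$

which imply that

$$-\frac{\sqrt{2mE_1}}{\hbar}x_0 = -\frac{\pi}{2} \quad , \quad \frac{\sqrt{2mE_1}}{\hbar}(L - x_0) = \frac{\pi}{2} \tag{8.671}$$

or

$$\frac{\sqrt{2mE_1}}{\hbar}L = \pi \tag{8.672}$$

This gives

$$E_1 = \frac{\hbar^2\pi^2}{2mL^2} \tag{8.673}$$

and

$$f_1 = -\sqrt{2mE_1}\,\tan\left[\frac{\sqrt{2mE_1}}{\hbar}\left(x - \frac{\pi\hbar}{2\sqrt{2mE_1}}\right)\right] = -\frac{\pi\hbar}{L}\tan\left[\frac{\pi}{L}x - \frac{\pi}{2}\right] = -\frac{\pi\hbar}{L}\frac{\sin\left[\frac{\pi}{L}x - \frac{\pi}{2}\right]}{\cos\left[\frac{\pi}{L}x - \frac{\pi}{2}\right]} = \frac{\pi\hbar}{L}\frac{\cos\left(\frac{\pi x}{L}\right)}{\sin\left(\frac{\pi x}{L}\right)} = \frac{\pi\hbar}{L}\cot\left(\frac{\pi x}{L}\right) \tag{8.674}$$

This implies that

$$\eta_1 = \frac{1}{\sqrt{2m}}\left[p + i\frac{\pi\hbar}{L}\cot\frac{\pi x}{L}\right] \tag{8.675}$$

Reflecting on this result we now choose

$$\eta_j = \frac{1}{\sqrt{2m}}\left[p + ic_j\cot(b_jx)\right] \tag{8.676}$$

where $c_j$ and $b_j$ are to be chosen to give the correct recursion relations. Now,
in order to guarantee $0 \leq x \leq L$ we must have $0 \leq b_j \leq \pi/L$. If we apply the
recursion relations we get

$$\eta_{j+1}^+\eta_{j+1} = \frac{1}{2m}\left[p^2 - c_{j+1}b_{j+1}\hbar + c_{j+1}(c_{j+1} - b_{j+1}\hbar)\cot^2 b_{j+1}x\right] \tag{8.677}$$

$$\eta_j\eta_j^+ = \frac{1}{2m}\left[p^2 + c_jb_j\hbar + c_j(c_j + b_j\hbar)\cot^2 b_jx\right] \tag{8.678}$$

which implies that

$$\frac{1}{2m}\left[p^2 - c_{j+1}b_{j+1}\hbar + c_{j+1}(c_{j+1} - b_{j+1}\hbar)\cot^2 b_{j+1}x\right] + E_{j+1} = \frac{1}{2m}\left[p^2 + c_jb_j\hbar + c_j(c_j + b_j\hbar)\cot^2 b_jx\right] + E_j \tag{8.679}$$

Equating powers of $x$ gives

$$b_{j+1} = b_j \rightarrow b_j = b_1 = \frac{\pi}{L} \tag{8.680}$$

$$c_{j+1}(c_{j+1} - b_{j+1}\hbar) = c_j(c_j + b_j\hbar) \tag{8.681}$$

$$2mE_{j+1} - c_{j+1}b_{j+1}\hbar = 2mE_j + c_jb_j\hbar \tag{8.682}$$

or

$$2mE_j - (c_j)^2 = 2mE_1 - (c_1)^2 = 0 \tag{8.683}$$

$$E_j = \frac{c_j^2}{2m} \tag{8.684}$$

and

$$c_{j+1}\left(c_{j+1} - \frac{\pi\hbar}{L}\right) = c_j\left(c_j + \frac{\pi\hbar}{L}\right) \tag{8.685}$$

which gives

$$c_{j+1} = -c_j \quad \text{or} \quad c_{j+1} = c_j + \frac{\pi\hbar}{L} \tag{8.686}$$

The last choice implies the largest $E_j$. Therefore,

$$c_j = j\frac{\pi\hbar}{L} \quad \text{and} \quad E_j = \frac{j^2\pi^2\hbar^2}{2mL^2} \tag{8.687}$$

Finally, we show the existence of $|\xi_j\rangle$ where $\eta_j|\xi_j\rangle = 0$. This corresponds to the
equation

$$\left[-i\hbar\frac{d}{dx} + i\frac{j\pi\hbar}{L}\cot\left(\frac{\pi x}{L}\right)\right]\langle x|\xi_j\rangle = 0 \tag{8.688}$$

or

$$\langle x|\xi_j\rangle = \left(\sin\frac{\pi x}{L}\right)^j \tag{8.689}$$

Now

$$|E_j\rangle = \eta_1^+\eta_2^+\cdots\eta_{j-1}^+|\xi_j\rangle \tag{8.690}$$

implies that

$$\psi_j(x) = \langle x|E_j\rangle = \langle x|\eta_1^+\eta_2^+\cdots\eta_{j-1}^+|\xi_j\rangle \tag{8.691}$$

Therefore,

$$\psi_1(x) = \langle x|E_1\rangle = \langle x|\xi_1\rangle = \sin\frac{\pi x}{L} \tag{8.692}$$

$$\psi_2(x) = \langle x|E_2\rangle = \langle x|\eta_1^+|\xi_2\rangle = \left[-i\hbar\frac{d}{dx} - i\frac{\pi\hbar}{L}\cot\left(\frac{\pi x}{L}\right)\right]\sin^2\frac{\pi x}{L} \sim \sin\frac{2\pi x}{L} \tag{8.693}$$

Similarly,

$$\psi_3(x) = \langle x|\eta_1^+\eta_2^+|\xi_3\rangle \sim \sin\frac{3\pi x}{L} \tag{8.694}$$

and so on.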

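We also find (with $\hbar = m = 1$, as in the earlier SUSY discussion)

$$V_0(x) = \frac{1}{2}\frac{\psi_0''}{\psi_0} \tag{8.695}$$

where

$$\psi_0(x) = \sqrt{\frac{2}{L}}\sin\frac{\pi x}{L} \tag{8.696}$$

or

$$V_0(x) = -\frac{1}{2}\frac{\pi^2}{L^2} \tag{8.697}$$

In addition,

$$W(x) = -\frac{1}{\sqrt{2}}\frac{\psi_0'}{\psi_0} = -\frac{1}{\sqrt{2}}\frac{\pi}{L}\cot\frac{\pi x}{L} \tag{8.698}$$

Finally, the superpartner potential is

$$V_1(x) = V_0(x) - \frac{d}{dx}\frac{\psi_0'}{\psi_0} = V_0(x) - \frac{\pi}{L}\frac{d}{dx}\cot\frac{\pi x}{L} = -\frac{\pi^2}{2L^2} + \frac{\pi^2}{L^2}\frac{1}{\sin^2(\pi x/L)} \tag{8.699}$$

This completes the solution. The method works even in this case!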
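
The ladder construction for the square well can be checked numerically. The Python sketch below is only an illustration under stated assumptions: it takes $c_j = j\pi\hbar/L$ and $\langle x|\xi_j\rangle = \sin^j(\pi x/L)$ from the derivation above, drops the constant factor $-i\hbar/\sqrt{2m}$ from each $\eta_j^+$, and uses central differences for the derivative. The function names and the sample points are hypothetical.

```python
import math

L = 1.0   # well width (arbitrary units)

def eta_dagger(j, f, h=1e-5):
    """Return x -> (d/dx + (j*pi/L) cot(pi x/L)) f(x), i.e. eta_j^+ acting on f
    up to the constant factor -i*hbar/sqrt(2m)."""
    def g(x):
        df = (f(x + h) - f(x - h)) / (2.0 * h)
        return df + (j * math.pi / L) * f(x) / math.tan(math.pi * x / L)
    return g

def xi(j):
    """<x|xi_j> = sin^j(pi x / L)."""
    return lambda x: math.sin(math.pi * x / L) ** j

psi2 = eta_dagger(1, xi(2))                     # eta_1^+ xi_2
psi3 = eta_dagger(1, eta_dagger(2, xi(3)))      # eta_1^+ eta_2^+ xi_3

for x in (0.15, 0.4, 0.7):
    r2 = psi2(x) / math.sin(2.0 * math.pi * x / L)
    r3 = psi3(x) / math.sin(3.0 * math.pi * x / L)
    print(x, r2, r3)   # each ratio should be independent of x
```

Constant ratios confirm that the constructed states are proportional to $\sin(2\pi x/L)$ and $\sin(3\pi x/L)$; working the derivatives by hand gives the constants $3\pi/2L$ and $5\pi^2/L^2$.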


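(3) Hydrogen Atom - As we will see in later discussions (Chapter 9), when
we write down the 3-dimensional Schrödinger equation for the hydrogen atom
and separate the variables, we obtain a 1-dimensional equation in the variable
$r$ corresponding to radius. This equation looks like

$$H_\ell u_{n\ell}(r) = \left[\frac{1}{2m}p_r^2 + \frac{\hbar^2\ell(\ell+1)}{2mr^2} - \frac{e^2}{r}\right]u_{n\ell}(r) = E_{n\ell}u_{n\ell}(r) \tag{8.700}$$

The factorization method should also work on this equation. We choose

$$\eta_1^+\eta_1 + E_1 = H_\ell \tag{8.701}$$

$$H_\ell = \frac{1}{2m}p_r^2 + \frac{\hbar^2\ell(\ell+1)}{2mr^2} - \frac{e^2}{r} \tag{8.702}$$

For convenience, we have suppressed the $\ell$ dependence ($\eta_j \leftrightarrow \eta_j^{(\ell)}$). Now assume

$$\eta_j = \frac{1}{\sqrt{2m}}\left[p_r + i\left(b_j + \frac{c_j}{r}\right)\right] \tag{8.703}$$

which gives

$$\eta_j^+\eta_j = \frac{1}{2m}\left[p_r^2 + b_j^2 + 2\frac{b_jc_j}{r} + \frac{c_j}{r^2}(c_j - \hbar)\right] \tag{8.704}$$

$$\eta_j\eta_j^+ = \frac{1}{2m}\left[p_r^2 + b_j^2 + 2\frac{b_jc_j}{r} + \frac{c_j}{r^2}(c_j + \hbar)\right] \tag{8.705}$$

The definition of the Hamiltonian gives

$$\frac{1}{2m}\left[p_r^2 + b_1^2 + 2\frac{b_1c_1}{r} + \frac{c_1}{r^2}(c_1 - \hbar)\right] + E_1 = \frac{1}{2m}p_r^2 + \frac{\hbar^2\ell(\ell+1)}{2mr^2} - \frac{e^2}{r} \tag{8.706}$$

Equating powers of $r$ gives

$$\frac{1}{2m}c_1(c_1 - \hbar) = \frac{1}{2m}\ell(\ell+1)\hbar^2 \tag{8.707}$$

$$\frac{b_1c_1}{m} = -e^2 \tag{8.708}$$

$$\frac{b_1^2}{2m} + E_1 = 0 \tag{8.709}$$

which imply

$$c_1 = (\ell + 1)\hbar \tag{8.710}$$

$$b_1 = -\frac{me^2}{(\ell + 1)\hbar} \tag{8.711}$$

$$E_1 = -\frac{1}{2m}\left(\frac{me^2}{(\ell + 1)\hbar}\right)^2 \tag{8.712}$$

The recursion relations then give

$$c_{j+1}(c_{j+1} - \hbar) = c_j(c_j + \hbar) \tag{8.713}$$

$$b_{j+1}c_{j+1} = b_jc_j \tag{8.714}$$

$$\frac{b_{j+1}^2}{2m} + E_{j+1} = \frac{b_j^2}{2m} + E_j \tag{8.715}$$

which imply

$$c_{j+1} = c_j + \hbar \rightarrow c_j = (j - 1)\hbar + c_1 \rightarrow c_j = (\ell + j)\hbar \tag{8.716}$$

$$b_jc_j = b_1c_1 = -me^2 \tag{8.717}$$

$$E_j = -\frac{b_j^2}{2m} = -\frac{1}{2m}\left(\frac{me^2}{(\ell + j)\hbar}\right)^2 \tag{8.718}$$

If we let $n = \ell + j$ we have

$$E_n = -\frac{1}{2}\alpha^2mc^2\frac{1}{n^2} \tag{8.719}$$

where

$$\alpha = \frac{e^2}{\hbar c} = \text{the fine structure constant} \tag{8.720}$$

Finally, we get the eigenfunctions. We must have $\eta_j|\xi_j\rangle = 0$. Using

$$\eta_j = \frac{1}{\sqrt{2m}}\left[p_r + i\left(b_j + \frac{c_j}{r}\right)\right] = \frac{1}{\sqrt{2m}}\left[p_r - \frac{ime^2}{(\ell + j)\hbar} + \frac{i(\ell + j)\hbar}{r}\right] = \frac{1}{\sqrt{2m}}\left[p_r - \frac{i\hbar}{(\ell + j)a_0} + \frac{i(\ell + j)\hbar}{r}\right] \tag{8.721}$$

where

$$a_0 = \frac{\hbar^2}{me^2} = \text{the Bohr radius} \tag{8.722}$$

we get

$$\left[p_r - \frac{i\hbar}{(\ell + j)a_0} + \frac{i(\ell + j)\hbar}{r}\right]|\xi_j\rangle = 0 \tag{8.723}$$

where

$$p_r = -i\hbar\left[\frac{d}{dr} + \frac{1}{r}\right] \tag{8.724}$$

This implies

$$\left[-i\hbar\left(\frac{d}{dr} + \frac{1}{r}\right) - \frac{i\hbar}{(\ell + j)a_0} + \frac{i(\ell + j)\hbar}{r}\right]\langle r|\xi_j\rangle = 0 \tag{8.725}$$

which has a solution

$$\langle r|\xi_j\rangle \sim r^{\ell + j - 1}e^{-\frac{r}{(\ell + j)a_0}} \tag{8.726}$$

We thus have

$$\psi_1^{(0)}(r) \sim e^{-\frac{r}{a_0}} \quad , \quad \psi_2^{(0)}(r) \sim \left(1 - \frac{r}{2a_0}\right)e^{-\frac{r}{2a_0}} \tag{8.727}$$

and so on, which are the correct wave functions!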

8.14.5. Supersymmetric Quantum Mechanics 

This is the study of the relationship between pairs of superpartner potentials. 

Some of the issues addressed are: 

1. To what extent can the observed energy spectrum determine the potential
energy function? In other words, can we invert the energy levels and find
the $V(x)$ that generated them? Is the inversion unique?

2. The number of potentials for which we can analytically solve the Schrödinger
equation is small. They are the infinite well, the harmonic oscillator (1-,
2-, and 3-dimensions), the Coulomb potential and a few others.

All of these potentials can also be solved using the factorization method
operator techniques. Thus, they all have supersymmetric analogs. All the
superpartner potentials are similar in shape, differing only in the parameters
in their definition.

3. Supersymmetric quantum mechanics suggests a connection between fermions
and bosons at some fundamental level.

In supersymmetric quantum mechanics we study quantum mechanical systems
where the Hamiltonian $H$ is constructed from anticommuting charges $Q$ which
are the square root of $H$, i.e.,

$$2H = \{Q, Q^+\} = QQ^+ + Q^+Q \tag{8.728}$$

$$0 = \{Q, Q\} \rightarrow Q^2 = 0 \tag{8.729}$$

These equations imply that

$$[H, Q] = -[Q, H] = -\frac{1}{2}[Q, QQ^+ + Q^+Q] = -\frac{1}{2}\left(Q^2Q^+ + QQ^+Q - QQ^+Q - Q^+Q^2\right) = 0$$

or that the charge $Q$ is a conserved observable.

These Hamiltonians contain coordinates that are quantized by commutators
(bosonic coordinates) and anticommutators (fermionic coordinates). These coordinates
are mixed by supersymmetry transformations.

For a particle with spin, the position and spin orientation form a pair of such
coordinates. An explicit example is given by

$$Q = (p + i\phi(x))\psi^+ \quad , \quad Q^+ = (p - i\phi(x))\psi \tag{8.730}$$

where $x$ and $p$ are bosonic coordinates (degrees of freedom) and $\psi$ and $\psi^+$ are
fermionic coordinates (we have set $\hbar = 1$) satisfying

$$[x, p] = i \quad , \quad \{\psi, \psi^+\} = 1 \quad , \quad \{\psi, \psi\} = \{\psi^+, \psi^+\} = 0 \tag{8.731}$$

i.e., we have commutators for the bosonic coordinates and anticommutators for
the fermionic coordinates.

From these relations we get

$$\{Q, Q\} = \{Q^+, Q^+\} = 0 \tag{8.732}$$

and

$$H = \frac{1}{2}p^2 + \frac{1}{2}\phi^2(x) + \frac{1}{2}[\psi, \psi^+]\phi'(x) \tag{8.733}$$

Using a 2 x 2 representation of the fermionic coordinates, we have

$$\psi^+ = \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \quad , \quad \psi = \sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \tag{8.734}$$

$$\{\psi, \psi^+\} = \sigma_+\sigma_- + \sigma_-\sigma_+ = I \quad , \quad [\psi, \psi^+] = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{8.735}$$

so that

$$H = \frac{1}{2}\left(p^2 + \phi^2(x)\right) + \frac{1}{2}\sigma_z\phi'(x) = H_0 + H_1 = \text{Bose sector} + \text{Fermi sector} \tag{8.736}$$

The two sectors have the same energy levels. The only exception is the case
where the ground state of the Bose sector has zero energy and is thus
non-degenerate.
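
The algebra of the fermionic coordinates in a 2 x 2 representation can be verified directly. The Python sketch below is an illustration that assumes the standard Pauli convention $\psi = \sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $\psi^+ = \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$; the helper names `matmul` and `combine` are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(a, b, sign):
    """Entrywise a + sign * b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

psi = [[0, 1], [0, 0]]        # sigma_+ (assumed standard convention)
psi_dag = [[0, 0], [1, 0]]    # sigma_-

anticommutator = combine(matmul(psi, psi_dag), matmul(psi_dag, psi), +1)
commutator = combine(matmul(psi, psi_dag), matmul(psi_dag, psi), -1)
psi_squared = matmul(psi, psi)

print(anticommutator)   # identity matrix
print(commutator)       # diag(1, -1), i.e. sigma_z in this convention
print(psi_squared)      # zero matrix, so {psi, psi} = 0
```

With this convention the term $\frac{1}{2}[\psi, \psi^+]\phi'(x)$ contributes $+\frac{1}{2}\phi'$ in the upper component and $-\frac{1}{2}\phi'$ in the lower one, which is the same block structure as the compact matrix form of the two partner Hamiltonians.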


8.14.6. Additional Thoughts 

Supersymmetry transformations are represented by the unitary operator

$$U = \exp(\epsilon Q + \epsilon^+Q^+) \tag{8.737}$$

where $\epsilon$ and $\epsilon^+$ are anti-commuting c-numbers (called a Grassmann algebra).

Supersymmetric one-particle quantum mechanics serves as a model for the investigation
of spontaneous breaking of supersymmetry, which is supposed to occur
in supersymmetric field theories. The ground state $|0\rangle$ is invariant with respect
to supersymmetry transformations provided that $U|0\rangle = |0\rangle$. This is satisfied if
and only if $Q|0\rangle = Q^+|0\rangle = 0$, that is, if the ground state energy is zero.

If the ground state energy is greater than zero, then supersymmetry is spontaneously
broken.

An example of spontaneously broken symmetry is

$$\Phi = g(x^2 - a^2) \tag{8.738}$$

$$H = \frac{p^2}{2m} + \frac{g^2}{2}(x^2 - a^2)^2 + gx\sigma_z \tag{8.739}$$

The two potentials satisfy $V_0(-x) = V_1(x)$. There is no normalizable state in
this case with $E_0^0 = 0$ since

$$\int\Phi(x)\,dx = g\left(\frac{1}{3}x^3 - a^2x\right) \tag{8.740}$$

so that $\exp[-\int\Phi(x)\,dx]$ cannot be normalized, which says that the ground state energy is positive.

The world we actually observe is in one of the degenerate ground states and the
supersymmetry is spontaneously broken!


8.15. Problems 

8.15.1. Delta function in a well 

A particle of mass $m$ moving in one dimension is confined to a space $0 < x < L$ by
an infinite well potential. In addition, the particle experiences a delta function
potential of strength $\lambda$ given by $\lambda\delta(x - L/2)$ located at the center of the well as
shown in Figure 8.26 below.

Figure 8.26: Potential Diagram

Find a transcendental equation for the energy eigenvalues $E$ in terms of the
mass $m$, the potential strength $\lambda$, and the size of the well $L$.

8.15.2. Properties of the wave function 

A particle of mass $m$ is confined to a one-dimensional region $0 \leq x \leq a$ (an
infinite square well potential). At $t = 0$ its normalized wave function is

$$\psi(x, t = 0) = \sqrt{\frac{8}{5a}}\left(1 + \cos\left(\frac{\pi x}{a}\right)\right)\sin\left(\frac{\pi x}{a}\right)$$

(a) What is the wave function at a later time $t = t_0$?

(b) What is the average energy of the system at $t = 0$ and $t = t_0$?

(c) What is the probability that the particle is found in the left half of the
box (i.e., in the region $0 \leq x \leq a/2$) at $t = t_0$?

8.15.3. Repulsive Potential

A repulsive short-range potential with a strongly attractive core can be approximated
by a square barrier with a delta function at its center, namely,

V(x ) = Vo©(M - a) - ^S(x) 

2 m 

(a) Show that there is a negative energy eigenstate (the ground-state).

(b) If $E_0$ is the ground-state energy of the delta-function potential in the
absence of the positive potential barrier, then the ground-state energy of
the present system satisfies the relation $E \geq E_0 + V_0$. What is the particular
value of $V_0$ for which we have the limiting case of a ground-state with zero
energy?


8.15.4. Step and Delta Functions 

Consider a one-dimensional potential with a step-function component and an 
attractive delta function component just at the edge of the step, namely, 

V(x) = VQ(x) - — -S(x) 

2m 

(a) For E > V, compute the reflection coefficient for particle incident from the 
left. How does this result differ from that of the step barrier alone at high 
energy? 

(b) For E < 0 determine the energy eigenvalues and eigenfunctions of any 
bound-state solutions. 


8.15.5. Atomic Model 


An approximate model for an atom near a wall is to consider a particle moving
under the influence of the one-dimensional potential given by

$$V(x) = \begin{cases} -V_0\,\delta(x) & x > -d \\ \infty & x < -d \end{cases}$$

as shown in Figure 8.27 below.


(a) Find the transcendental equation for the bound state energies. 

(b) Find an approximation for the modification of the bound-state energy 
caused by the wall when it is far away. Define carefully what you mean 
by far away. 

(c) What is the exact condition on $V_0$ and $d$ for the existence of at least one
bound state?

Figure 8.27: Potential Diagram


8.15.6. A confined particle 

A particle of mass $m$ is confined to a space $0 < x < a$ in one dimension by
infinitely high walls at $x = 0$ and $x = a$. At $t = 0$ the particle is initially in the
left half of the well with a wave function given by

$$\psi(x, 0) = \begin{cases} \sqrt{2/a} & 0 < x < a/2 \\ 0 & a/2 < x < a \end{cases}$$

(a) Find the time-dependent wave function i/)(x,t). 

(b) What is the probability that the particle is in the n th eigenstate of the 
well at time t? 

(c) Derive an expression for average value of particle energy. What is the 
physical meaning of your result? 


8.15.7. 1/x potential 

An electron moves in one dimension and is confined to the right half-space 
(x > 0) where it has potential energy 

where e is the charge on an electron. 

(a) What is the solution of the Schrödinger equation at large $x$?

(b) What is the boundary condition at $x = 0$?

(c) Use the results of (a) and (b) to guess the ground state solution of the
equation. Remember the ground state wave function has no zeros except
at the boundaries.

(d) Find the ground state energy.

(e) Find the expectation value $\langle x\rangle$ in the ground state.

8.15.8. Using the commutator 

Using the coordinate-momentum commutation relation prove that

$$\sum_n (E_n - E_0)\,|\langle E_n|x|E_0\rangle|^2 = \text{constant}$$

where $E_0$ is the energy corresponding to the eigenstate $|E_0\rangle$. Determine the
value of the constant. Assume the Hamiltonian has the general form

$$H = \frac{p^2}{2m} + V(x)$$

8.15.9. Matrix Elements for Harmonic Oscillator 

Compute the following matrix elements

$$\langle m|x^3|n\rangle \quad , \quad \langle m|xp|n\rangle$$

8.15.10. A matrix element

Show for the one dimensional simple harmonic oscillator

$$\langle 0|e^{ik\hat{x}}|0\rangle = \exp\left[-k^2\langle 0|\hat{x}^2|0\rangle/2\right]$$

where $\hat{x}$ is the position operator.


8.15.11. Correlation function 

Consider a function, known as the correlation function, defined by 

C(t) = (£(t)x( 0)} 

where x(t ) is the position operator in the Heisenberg picture. Evaluate the 
correlation function explicitly for the ground-state of the one dimensional simple 
harmonic oscillator. 
8.15.12. Instantaneous Force

Consider a simple harmonic oscillator in its ground state. 

An instantaneous force imparts momentum $p_0$ to the system such that the new
state vector is given by

$$|\psi\rangle = e^{-ip_0 x/\hbar}\,|0\rangle$$

where $|0\rangle$ is the ground state of the original oscillator.

What is the probability that the system will stay in its ground state? 

8.15.13. Coherent States 

Coherent states are defined to be eigenstates of the annihilation or lowering
operator in the harmonic oscillator potential. Each coherent state has a complex
label $z$ and is given by $|z\rangle = e^{z a^\dagger}|0\rangle$.

(a) Show that $a|z\rangle = z|z\rangle$

(b) Show that $\langle z_1|z_2\rangle = e^{z_1^* z_2}$

(c) Show that the completeness relation takes the form

$$I = \sum_n |n\rangle\langle n| = \int \frac{dx\,dy}{\pi}\, e^{-z^* z}\, |z\rangle\langle z|$$

where $|n\rangle$ is a standard harmonic oscillator energy eigenstate, $I$ is the identity
operator, $z = x + iy$, and the integration is taken over the whole x-y plane (use
polar coordinates).

8.15.14. Oscillator with Delta Function 

Consider a harmonic oscillator potential with an extra delta function term at
the origin, that is,

$$V(x) = \frac{1}{2}m\omega^2 x^2 + \frac{\hbar^2 g}{2m}\,\delta(x)$$

(a) Using the parity invariance of the Hamiltonian, show that the energy 
eigenfunctions are even and odd functions and that the simple harmonic 
oscillator odd-parity energy eigenstates are still eigenstates of the system 
Hamiltonian, with the same eigenvalues. 

(b) Expand the even-parity eigenstates of the new system in terms of the
even-parity harmonic oscillator eigenfunctions and determine the expansion
coefficients.

(c) Show that the energy eigenvalues that correspond to even eigenstates are
solutions of the equation

$$\frac{2}{g} = -\sqrt{\frac{\hbar}{m\pi\omega}}\,\sum_{k=0}^{\infty} \frac{(2k)!}{2^{2k}(k!)^2} \left(2k + \frac{1}{2} - \frac{E}{\hbar\omega}\right)^{-1}$$

You might need the fact that

$$\psi_{2k}(0) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \frac{\sqrt{(2k)!}}{2^k\,k!}$$


(d) Consider the following cases: 

(1) g>0, E>0 

(2) g < 0, E > 0 

(3) g < 0, E < 0 

Show the first and second cases correspond to an infinite number of energy 
eigenvalues. 

Where are they relative to the original energy eigenvalues of the harmonic 
oscillator? 


Show that in the third case, that of an attractive delta function core, there 
exists a single eigenvalue corresponding to the ground state of the system 
provided that the coupling is such that 

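$$\left[\frac{\Gamma(3/4)}{\Gamma(1/4)}\right]^2 < \frac{g^2\hbar}{16\,m\omega}$$

You might need the series summation:

$$\sum_{k=0}^{\infty} \frac{(2k)!}{4^k (k!)^2}\,\frac{1}{2k+1-x} = \frac{\sqrt{\pi}}{2}\,\frac{\Gamma(1/2 - x/2)}{\Gamma(1 - x/2)}$$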

You will need to look up other properties of the gamma function to solve 
this problem. 


8.15.15. Measurement on a Particle in a Box 

Consider a particle in a box of width a, prepared in the ground state. 

(a) What are the possible values one can measure for: (1) energy, (2) position,
(3) momentum?

(b) What are the probabilities for the possible outcomes you found in part 
(a)? 
(c) At some time (call it t = 0) we perform a measurement of position. However,
our detector has only finite resolution. We find that the particle is
in the middle of the box (call it the origin) with an uncertainty $\Delta x = a/2$,
that is, we know the position is, for sure, in the range -a/4 < x < a/4, but
we are completely uncertain where it is within this range. What is the
(normalized) post-measurement state?

(d) Immediately after the position measurement what are the possible values 
for (1) energy, (2) position, (3) momentum and with what probabilities? 

(e) At a later time, what are the possible values for (1) energy, (2) position, 
(3) momentum and with what probabilities? Comment. 

8.15.16. Aharonov-Bohm experiment 

Consider an infinitely long solenoid which carries a current I so that there is a
constant magnetic field inside the solenoid (see Figure 8.3 below).
Suppose that in the region outside the solenoid the motion of a particle with
charge e and mass m is described by the Schrödinger equation. Assume that
for I = 0, the solution of the equation is given by

(a) Write down and solve the Schrödinger equation in the region outside the
solenoid in the case $I \neq 0$.

(b) Consider the two-slit diffraction experiment for the particles described
above shown in Figure 8.3 above. Assume that the distance d between
the two slits is large compared to the diameter of the solenoid.

Compute the shift $\Delta S$ of the diffraction pattern on the screen due to the
presence of the solenoid with $I \neq 0$. Assume that $L \gg \Delta S$.

8.15.17. A Josephson Junction 

A Josephson junction is formed when two superconducting wires are separated
by an insulating gap of capacitance C. The quantum states $\psi_i$, $i = 1,2$ of
the two wires can be characterized by the numbers $n_i$ of Cooper pairs (charge
= -2e) and phases $\theta_i$, such that $\psi_i = \sqrt{n_i}\,e^{i\theta_i}$ (Ginzburg-Landau approximation).
The (small) amplitude that a pair tunnels across a narrow insulating barrier is
$-E_J/n_0$ where $n_0 = n_1 + n_2$ and $E_J$ is the so-called Josephson energy. The
interesting physics is expressed in terms of the differences

$$n = n_2 - n_1 , \qquad \varphi = \theta_2 - \theta_1$$

We consider a junction where

$$n_1 \approx n_2 \approx n_0/2$$

When there exists a nonzero difference n between the numbers of pairs of charge
-2e, where e > 0, on the two sides of the junction, there is net charge -ne on side
2 and net charge +ne on side 1. Hence a voltage difference ne/C arises, where
the voltage on side 1 is higher than that on side 2 if $n = n_2 - n_1 > 0$. Taking the
zero of the voltage to be at the center of the junction, the electrostatic energy
of the Cooper pair of charge -2e on side 2 is $ne^2/C$, and that of a pair on side
1 is $-ne^2/C$. The total electrostatic energy is $C(\Delta V)^2/2 = Q^2/2C = (ne)^2/2C$.

The equations of motion for a pair in the two-state system (1,2) are

$$i\hbar\frac{d\psi_1}{dt} = U_1\psi_1 - \frac{E_J}{n_0}\psi_2 = -\frac{ne^2}{C}\psi_1 - \frac{E_J}{n_0}\psi_2$$

$$i\hbar\frac{d\psi_2}{dt} = U_2\psi_2 - \frac{E_J}{n_0}\psi_1 = \frac{ne^2}{C}\psi_2 - \frac{E_J}{n_0}\psi_1$$

(a) Discuss the physics of the terms in these equations. 
(b) Using $\psi_i = \sqrt{n_i}\,e^{i\theta_i}$, show that the equations of motion for n and $\varphi$ are
given by

$$\dot{\varphi} = \dot{\theta}_2 - \dot{\theta}_1 \approx -\frac{2ne^2}{\hbar C}$$

$$\dot{n} = \dot{n}_2 - \dot{n}_1 \approx \frac{E_J}{\hbar}\sin\varphi$$

(c) Show that the pair (electric) current from side 1 to side 2 is given by

$$J_S = J_0 \sin\varphi , \qquad J_0 = \frac{\pi E_J}{\Phi_0}$$

(d) Show that

$$\ddot{\varphi} \approx -\frac{2e^2 E_J}{\hbar^2 C}\sin\varphi$$

For $E_J$ positive, show that this implies there are oscillations about $\varphi = 0$
whose angular frequency (called the Josephson plasma frequency) is given
by

$$\omega_J = \sqrt{\frac{2e^2 E_J}{\hbar^2 C}}$$

for small amplitudes.

If $E_J$ is negative, then there are oscillations about $\varphi = \pi$.

(e) If a voltage $V = V_1 - V_2$ is applied across the junction (by a battery), a
charge $Q_1 = VC = (-2e)(-n/2) = en$ is held on side 1, and the negative of
this on side 2. Show that we then have

$$\dot{\varphi} \approx -\frac{2eV}{\hbar} \equiv -\omega$$

which gives $\varphi = -\omega t$.

The battery holds the charge difference across the junction fixed at VC =
en, but can be a source or sink of charge such that a current can flow in
the circuit. Show that in this case, the current is given by

$$J_S = -J_0 \sin\omega t$$

i.e., the DC voltage of the battery generates an AC pair current in the circuit
of frequency

$$\omega = \frac{2eV}{\hbar}$$


8.15.18. Eigenstates using Coherent States 

Obtain eigenstates of the following Hamiltonian 

$$H = \hbar\omega\,a^\dagger a + V a + V^* a^\dagger$$

for a complex V using coherent states. 
8.15.19. Bogoliubov Transformation

Suppose annihilation and creation operators satisfy the standard commutation
relations $[a, a^\dagger] = 1$. Show that the Bogoliubov transformation

$$b = a\cosh\eta + a^\dagger\sinh\eta$$

preserves the commutation relation of the creation and annihilation operators,
i.e., $[b, b^\dagger] = 1$. Use this fact to obtain eigenvalues of the following Hamiltonian

$$H = \hbar\omega\,a^\dagger a + \frac{1}{2}V\left(aa + a^\dagger a^\dagger\right)$$

(There is an upper limit on V for which this can be done). Also show that the
unitary operator

$$U = e^{(aa - a^\dagger a^\dagger)\eta/2}$$

can relate the two sets of operators as $b = U a U^{-1}$.

8.15.20. Harmonic oscillator 

Consider a particle in a 1-dimensional harmonic oscillator potential. Suppose 
at time t = 0 , the state vector is 

$$|\psi(0)\rangle = e^{-ipa/\hbar}\,|0\rangle$$

where p is the momentum operator and a is a real number.

(a) Use the equation of motion in the Heisenberg picture to find the operator
$x(t)$.

(b) Show that $e^{-ipa/\hbar}$ is the translation operator.

(c) In the Heisenberg picture calculate the expectation value $\langle x \rangle$ for t > 0.

8.15.21. Another oscillator 

A 1-dimensional harmonic oscillator is, at time t = 0, in the state 
$$|\psi(t = 0)\rangle = \frac{1}{\sqrt{3}}\left(|0\rangle + |1\rangle + |2\rangle\right)$$

where $|n\rangle$ is the nth energy eigenstate. Find the expectation value of position
and energy at time t. 
8.15.22. The coherent state


Consider a particle of mass m in a harmonic oscillator potential of frequency $\omega$.
Suppose the particle is in the state

$$|\alpha\rangle = \sum_{n=0}^{\infty} c_n\,|n\rangle$$

where

$$c_n = e^{-|\alpha|^2/2}\,\frac{\alpha^n}{\sqrt{n!}}$$

and $\alpha$ is a complex number. As we have discussed, this is a coherent state or
alternatively a quasi-classical state.

(a) Show that $|\alpha\rangle$ is an eigenstate of the annihilation operator, i.e., $a|\alpha\rangle = \alpha|\alpha\rangle$.

(b) Show that in this state $\langle x \rangle = x_c\,\mathrm{Re}(\alpha)$ and $\langle p \rangle = p_c\,\mathrm{Im}(\alpha)$. Determine $x_c$
and $p_c$.

(c) Show that, in position space, the wave function for this state is $\psi_\alpha(x) =
e^{ip_0 x/\hbar}\,u_0(x - x_0)$ where $u_0(x)$ is the ground state gaussian function and
$\langle x \rangle = x_0$ and $\langle p \rangle = p_0$.

(d) What is the wave function in momentum space? Interpret $x_0$ and $p_0$.

(e) Explicitly show that $\psi_\alpha(x)$ is an eigenstate of the annihilation operator
using the position-space representation of the annihilation operator.

(f) Show that the coherent state is a minimum uncertainty state (with equal
uncertainties in x and p, in characteristic dimensionless units).

(g) If at time t = 0 the state is $|\psi(0)\rangle = |\alpha\rangle$, show that at a later time,

$$|\psi(t)\rangle = e^{-i\omega t/2}\,|\alpha e^{-i\omega t}\rangle$$

Interpret this result.

(h) Show that, as a function of time, $\langle x \rangle$ and $\langle p \rangle$ follow the classical trajectory
of the harmonic oscillator, hence the name quasi-classical state.

(i) Write the wave function as a function of time, $\psi_\alpha(x,t)$. Sketch the time
evolving probability density.

(j) Show that in the classical limit

$$\lim_{|\alpha|\to\infty} \frac{\Delta N}{\langle N \rangle} \to 0$$

(k) Show that the probability distribution in n is Poissonian, with appropriate
parameters.

(l) Use a rough time-energy uncertainty principle ($\Delta E\,\Delta t > \hbar$), to find an
uncertainty principle between the number and phase of a quantum oscillator.
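
As a numerical sanity check on parts (a) and (k), the following sketch builds the coherent-state coefficients $c_n = e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$ in a truncated number basis. It is not a substitute for the analytic proofs. The cutoff `nmax`, the sample value of `alpha`, and the function name `coherent_coeffs` are illustrative assumptions, not part of the problem statement.

```python
import numpy as np
from math import factorial

def coherent_coeffs(alpha, nmax):
    """Coefficients c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!) for n = 0..nmax-1."""
    n = np.arange(nmax)
    root_fact = np.sqrt(np.array([factorial(k) for k in range(nmax)], dtype=float))
    return np.exp(-abs(alpha) ** 2 / 2) * alpha ** n / root_fact

alpha = 1.5 + 0.5j          # sample complex label (assumption)
nmax = 60                   # truncation of the number basis (assumption)
c = coherent_coeffs(alpha, nmax)

# Truncated annihilation operator: a|n> = sqrt(n)|n-1>
a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)

# Part (a): a|alpha> - alpha|alpha> should vanish up to truncation error
residual = np.linalg.norm(a @ c - alpha * c)

# Part (k): |c_n|^2 should be Poissonian, so mean and variance both equal |alpha|^2
p_n = np.abs(c) ** 2
n = np.arange(nmax)
mean_n = np.sum(n * p_n)
var_n = np.sum(n ** 2 * p_n) - mean_n ** 2

print(residual, mean_n, var_n, abs(alpha) ** 2)
```

Since the mean and variance of N are both $|\alpha|^2$ here, $\Delta N/\langle N \rangle = 1/|\alpha|$, which is the scaling behind part (j).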
8.15.23. Neutrino Oscillations


It is generally recognized that there are at least three different kinds of neutrinos.
They can be distinguished by the reactions in which the neutrinos are created
or absorbed. Let us call these three types of neutrino $\nu_e$, $\nu_\mu$ and $\nu_\tau$. It has been
speculated that each of these neutrinos has a small but finite rest mass, possibly
different for each type. Let us suppose, for this exam question, that there is
a small perturbing interaction between these neutrino types, in the absence of
which all three types of neutrinos have the same nonzero rest mass $M_0$. The
Hamiltonian of the system can be written as

$$H = H_0 + H_1$$

where

$$H_0 = \begin{pmatrix} M_0 & 0 & 0 \\ 0 & M_0 & 0 \\ 0 & 0 & M_0 \end{pmatrix} \quad \to \text{ no interactions present}$$

and

$$H_1 = \begin{pmatrix} 0 & \hbar\omega_1 & \hbar\omega_1 \\ \hbar\omega_1 & 0 & \hbar\omega_1 \\ \hbar\omega_1 & \hbar\omega_1 & 0 \end{pmatrix} \quad \to \text{ effect of interactions}$$

where we have used the basis

$$|\nu_e\rangle = |1\rangle , \qquad |\nu_\mu\rangle = |2\rangle , \qquad |\nu_\tau\rangle = |3\rangle$$

(a) First assume that $\omega_1 = 0$, i.e., no interactions. What is the time development
operator? Discuss what happens if the neutrino initially was in the
state

$$|\psi(0)\rangle = |\nu_e\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad |\psi(0)\rangle = |\nu_\mu\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad \text{or} \quad |\psi(0)\rangle = |\nu_\tau\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

What is happening physically in this case?

(b) Now assume that $\omega_1 \neq 0$, i.e., interactions are present. Also assume that
at t = 0 the neutrino is in the state

$$|\psi(0)\rangle = |\nu_e\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$

What is the probability as a function of time, that the neutrino will be in
each of the other two states?

(c) An experiment to detect the neutrino oscillations is being performed. The
flight path of the neutrinos is 2000 meters. Their energy is 100 GeV. The
sensitivity of the experiment is such that the presence of 1% of neutrinos
different from those present at the start of the flight can be measured with
confidence. Let $M_0$ = 20 eV. What is the smallest value of $\hbar\omega_1$ that can be
detected? How does this depend on $M_0$? Don't ignore special relativity.
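
For part (b), a numerical cross-check of an analytic answer can be obtained by diagonalizing $H = H_0 + H_1$ and applying $e^{-iHt/\hbar}$ to $|\nu_e\rangle$. The sketch below is a non-relativistic illustration only, so it does not address part (c). The function name `transition_probabilities` and the unit choice $\hbar = 1$ are assumptions made for the example.

```python
import numpy as np

def transition_probabilities(t, omega1, M0=0.0, hbar=1.0):
    """Probabilities of finding nu_e, nu_mu, nu_tau at time t, starting from nu_e."""
    # H = H0 + H1 in the basis (nu_e, nu_mu, nu_tau)
    H = M0 * np.eye(3) + hbar * omega1 * (np.ones((3, 3)) - np.eye(3))
    E, V = np.linalg.eigh(H)                      # H = V diag(E) V^dagger
    U = V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T
    psi_t = U @ np.array([1.0, 0.0, 0.0])         # initial state |nu_e>
    return np.abs(psi_t) ** 2

for t in (0.0, 0.5, 1.0):
    print(t, transition_probabilities(t, omega1=1.0))
```

Because $H_0$ is proportional to the identity, changing `M0` only multiplies the state by an overall phase and leaves these probabilities unchanged.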
8.15.24. Generating Function


Use the generating function for Hermite polynomials

$$e^{2xt - t^2} = \sum_{n=0}^{\infty} H_n(x)\,\frac{t^n}{n!}$$

to work out the matrix elements of x in the position representation, that is,
compute

$$\langle x \rangle_{nn'} = \int_{-\infty}^{\infty} \psi_n^*(x)\,x\,\psi_{n'}(x)\,dx$$

where

$$\psi_n(x) = N_n H_n(\alpha x)\,e^{-\frac{1}{2}\alpha^2 x^2}$$

and

$$\alpha = \left(\frac{m\omega}{\hbar}\right)^{1/2}$$


8.15.25. Given the wave function 


A particle of mass m moves in one dimension under the influence of a potential 
V(x). Suppose it is in an energy eigenstate 


$$\psi(x) = \left(\frac{\gamma^2}{\pi}\right)^{1/4} \exp\left(-\gamma^2 x^2/2\right)$$

with energy $E = \hbar^2\gamma^2/2m$.

(a) Find the mean position of the particle. 

(b) Find the mean momentum of the particle. 


(c) Find V(x). 

(d) Find the probability P(p)dp that the particle’s momentum is between p 
and p + dp. 


8.15.26. What is the oscillator doing? 

Consider a one dimensional simple harmonic oscillator. Use the number basis 
to do the following algebraically: 

(a) Construct a linear combination of $|0\rangle$ and $|1\rangle$ such that $\langle x \rangle$ is as large as
possible.

(b) Suppose the oscillator is in the state constructed in (a) at t = 0. What is
the state vector for t > 0? Evaluate the expectation value $\langle x \rangle$ as a function
of time for t > 0 using (i) the Schrödinger picture and (ii) the Heisenberg
picture.

(c) Evaluate $\langle (\Delta x)^2 \rangle$ as a function of time using either picture.
8.15.27. Coupled oscillators

Two identical harmonic oscillators in one dimension each have a mass m and
frequency $\omega$. Let the two oscillators be coupled by an interaction term $C x_1 x_2$
where C is a constant and $x_1$ and $x_2$ are the coordinates of the two oscillators.
Find the exact energy spectrum of eigenvalues for this coupled system.


8.15.28. Interesting operators .... 

The operator c is defined by the following relations:

$$c^2 = 0 , \qquad cc^\dagger + c^\dagger c = \{c, c^\dagger\} = I$$

(a) Show that

1. $N = c^\dagger c$ is Hermitian

2. $N^2 = N$

3. The eigenvalues of N are 0 and 1 (eigenstates $|0\rangle$ and $|1\rangle$)

4. $c^\dagger|0\rangle = |1\rangle$ , $c|0\rangle = 0$

(b) Consider the Hamiltonian

$$H = \hbar\omega_0\left(c^\dagger c + 1/2\right)$$

Denoting the eigenstates of H by $|n\rangle$, show that the only nonvanishing
states are the states $|0\rangle$ and $|1\rangle$ defined in (a).

(c) Can you think of any physical situation that might be described by these 
new operators? 


8.15.29. What is the state? 

A particle of mass m in a one dimensional harmonic oscillator potential is in a
state for which a measurement of the energy yields the values $\hbar\omega/2$ or $3\hbar\omega/2$,
each with a probability of one-half. The average value of the momentum $\langle p_x \rangle$
at time t = 0 is $(m\omega\hbar/2)^{1/2}$. This information specifies the state of the particle
completely. What is this state and what is $\langle p_x \rangle$ at time t?


8.15.30. Things about a particle in a box 


A particle of mass m moves in a one-dimensional box (infinite well) of length $\ell$
with the potential

$$V(x) = \begin{cases} \infty & x < 0 \\ 0 & 0 < x < \ell \\ \infty & x > \ell \end{cases}$$

At t = 0, the wave function of this particle is known to have the form

$$\psi(x,0) = \begin{cases} \sqrt{30/\ell^5}\; x(\ell - x) & 0 < x < \ell \\ 0 & \text{otherwise} \end{cases}$$

(a) Write this wave function as a linear combination of energy eigenfunctions



(b) What is the probability of measuring $E_n$ at t = 0?

(c) What is $\psi(x, t > 0)$?


8.15.31. Handling arbitrary barriers. 

Electrons in a metal are bound by a potential that may be approximated by a 
finite square well. Electrons fill up the energy levels of this well up to an energy 
called the Fermi energy as shown in the figure below: 





Figure 8.29: Finite Square Well 

The difference between the Fermi energy and the top of the well is the work 
function W of the metal. Photons with energies exceeding the work function 
can eject electrons from the metal - this is the so-called photoelectric effect. 

Another way to pull out electrons is through application of an external uniform
electric field $\mathcal{E}$, which alters the potential energy as shown in the figure below:


Figure 8.30: Finite Square Well + Electric Field 
By approximating (see notes below) the linear part of the function by a series
of square barriers, show that the transmission coefficient for electrons at the
Fermi energy is given by

$$T \approx \exp\left(-\frac{4\sqrt{2m}\,W^{3/2}}{3e\,|\mathcal{E}|\,\hbar}\right)$$

How would you expect this field- or cold-emission current to vary with the applied
voltage? As part of your problem solution explain the method.


This calculation also plays a role in the derivation of the current-voltage
characteristic of a Schottky diode in semiconductor physics.


Approximating an Arbitrary Barrier 


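For a rectangular barrier of width a and height $V_0$, we found the transmission
coefficient

$$T = \frac{1}{1 + \dfrac{V_0^2 \sinh^2\gamma a}{4E(V_0 - E)}} , \qquad \gamma^2 = (V_0 - E)\frac{2m}{\hbar^2} , \qquad k^2 = \frac{2m}{\hbar^2}E$$

A useful limiting case occurs for $\gamma a \gg 1$. In this case

$$\sinh\gamma a = \frac{e^{\gamma a} - e^{-\gamma a}}{2} \;\xrightarrow[\gamma a \gg 1]{}\; \frac{e^{\gamma a}}{2}$$

so that

$$T = \frac{1}{1 + \left(\dfrac{\gamma^2 + k^2}{2k\gamma}\right)^2 \sinh^2\gamma a} \;\xrightarrow[\gamma a \gg 1]{}\; \left(\frac{4k\gamma}{\gamma^2 + k^2}\right)^2 e^{-2\gamma a}$$

Now if we evaluate the natural log of the transmission coefficient we find

$$\ln T \;\xrightarrow[\gamma a \gg 1]{}\; \ln\left(\frac{4k\gamma}{\gamma^2 + k^2}\right)^2 - 2\gamma a \;\xrightarrow[\gamma a \gg 1]{}\; -2\gamma a$$

where we have dropped the logarithm relative to $\gamma a$ since ln(almost anything)
is not very large. This corresponds to only including the exponential term.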


We can now use this result to calculate the probability of transmission through 
a non-square barrier, such as that shown in the figure below: 

Figure 8.31: Arbitrary Barrier Potential 
When we only include the exponential term, the probability of transmission
through an arbitrary barrier, as above, is just the product of the individual
transmission coefficients of a succession of rectangular barriers as shown above.
Thus, if the barrier is sufficiently smooth so that we can approximate it by a
series of rectangular barriers (each of width $\Delta x$) that are not too thin for the
condition $\gamma a \gg 1$ to hold, then for the barrier as a whole

$$\ln T \approx \ln \prod_i T_i = \sum_i \ln T_i = -2\sum_i \gamma_i\,\Delta x$$

If we now assume that we can approximate this last term by an integral, we find

$$T \approx \exp\left(-2\int \gamma(x)\,dx\right) = \exp\left[-2\int \sqrt{\frac{2m}{\hbar^2}}\,\sqrt{V(x) - E}\;dx\right]$$



where the integration is over the region for which the square root is real. 


You may have a somewhat uneasy feeling about this crude derivation. Clearly, 
the approximations made break down near the turning points, where E = V(x). 
Nevertheless, a more detailed treatment shows that it works amazingly well. 
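
The slab-summing argument above can be checked numerically. The following sketch adds up $2\gamma_i\Delta x$ over many thin slabs for the triangular barrier of Problem 8.15.31, where $V(x) - E = W - Fx$ with $F = e|\mathcal{E}|$, and compares the sum with the closed-form exponent $4\sqrt{2m}\,W^{3/2}/(3F\hbar)$. The function name `wkb_exponent`, the illustrative units $m = \hbar = 1$, and the sample values of `W` and `F` are assumptions made for the example.

```python
import numpy as np

def wkb_exponent(v_minus_e, x_left, x_right, n=200_000, m=1.0, hbar=1.0):
    """Approximate 2 * integral of gamma(x) dx, gamma = sqrt(2 m (V - E)) / hbar,
    by summing 2 * gamma_i * dx over n rectangular slabs (midpoint rule)."""
    dx = (x_right - x_left) / n
    x = x_left + (np.arange(n) + 0.5) * dx           # slab midpoints
    gamma = np.sqrt(2.0 * m * np.clip(v_minus_e(x), 0.0, None)) / hbar
    return 2.0 * np.sum(gamma) * dx

# Triangular barrier: V(x) - E = W - F x between the turning points 0 and W/F
W, F = 1.0, 0.25                                     # illustrative values (assumption)
numeric = wkb_exponent(lambda x: W - F * x, 0.0, W / F)
closed_form = 4.0 * np.sqrt(2.0) * W ** 1.5 / (3.0 * F)   # m = hbar = 1

print(numeric, closed_form, np.exp(-numeric))
```

The two exponents agree to the accuracy of the midpoint rule. The check confirms only that the integral has been evaluated correctly, not that the underlying approximation is valid near the turning points.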


8.15.32. Deuteron model 

Consider the motion of a particle of mass $m = 0.8 \times 10^{-24}$ gm in the well shown
in the figure below:

[Figure: infinite wall at x = 0, well of depth $-V_0$ for 0 < x < a, V = 0 for x > a, bound level at -E]

Figure 8.32: Deuteron Model

The size of the well (range of the potential) is $a = 1.4 \times 10^{-13}$ cm. If the binding
energy of the system is 2.2 MeV, find the depth of the potential $V_0$ in MeV.
This is a model of the deuteron in one dimension.

8.15.33. Use Matrix Methods 

A one-dimensional potential barrier is shown in the figure below. 

Define and calculate the transmission probability for a particle of mass m and
energy E ($V_1 < E < V_0$) incident on the barrier from the left. If you let $V_1 \to 0$
and $a \to 2a$, then you can compare your answer to other textbook results. Develop
matrix methods (as in the text) to solve the boundary condition equations.

Figure 8.33: A Potential Barrier 


8.15.34. Finite Square Well Encore 

Consider the symmetric finite square well of depth Vo and width a. 

(a) Let $k_0 = \sqrt{2mV_0/\hbar^2}$. Sketch the bound states for the following choices of
$k_0 a/2$.

(i) $k_0 a/2 = 1$ , (ii) $k_0 a/2 = 1.6$ , (iii) $k_0 a/2 = 5$

(b) Show that no matter how shallow the well, there is at least one bound
state of this potential. Describe it.

(c) Let us re-derive the bound state energy for the delta function well directly
from the limit of the finite potential well. Use the graphical solution
discussed in the text. Take the limit as $a \to 0$, $V_0 \to \infty$, but $aV_0 \to
U_0$ (constant) and show that the binding energy is $E_b = mU_0^2/2\hbar^2$.

(d) Consider now the half-infinite well, half-finite potential well as shown below.

Figure 8.34: Half-Infinite, Half-Finite Well

Without doing any calculation, show that there are no bound states unless
$k_0 L > \pi/2$. HINT: think about erecting an infinite wall down the center
of a symmetric finite well of width a = 2L. Also, think about parity.

(e) Show that in general, the binding energy eigenvalues satisfy the eigenvalue
equation

$$\kappa = -k\cot kL$$

where

$$\kappa = \sqrt{\frac{2mE_b}{\hbar^2}}$$

and

$$k^2 + \kappa^2 = k_0^2 .$$
8.15.35. Half-Infinite Half-Finite Square Well Encore

Consider the unbound case ($E > V_0$) eigenstate of the potential below.

Figure 8.35: Half-Infinite, Half-Finite Well Again

Unlike the potentials with finite wall, the scattering in this case has only one
output channel - reflection. If we send in a plane wave towards the potential,
$\psi_{in}(x) = -Ae^{-ikx}$, where the particle has energy $E = (\hbar k)^2/2m$, the reflected
wave will emerge from the potential with a phase shift, $\psi_{out}(x) = Ae^{i(kx + \phi)}$.

(a) Show that the reflected wave is phase shifted by

$$\phi = 2\tan^{-1}\left(\frac{k}{q}\tan qL\right) - 2kL$$

where

$$q^2 = k^2 + k_0^2 , \qquad \frac{\hbar^2 k_0^2}{2m} = V_0$$

(b) Plot $\phi$ as a function of $k_0 L$ for fixed energy. Comment on
your plot.

(c) The phase shifted reflected wave is equivalent to that which would arise
from a hard wall, but moved a distance L' from the origin.

Figure 8.36: Shifted Wall

What is the effective L' as a function of the phase shift $\phi$ induced by our
semi-finite well? What is the maximum value of L'? Can L' be negative?
From your plot in (b), sketch L' as a function of $k_0 L$, for fixed energy.
Comment on your plot.
8.15.36. Nuclear α Decay

Nuclear alpha-decays $(A, Z) \to (A - 4, Z - 2) + \alpha$ have lifetimes ranging from
nanoseconds (or shorter) to millions of years (or longer). This enormous range
was understood by George Gamov by the exponential sensitivity to underlying
parameters in tunneling phenomena. Consider $\alpha = {}^4\mathrm{He}$ as a point particle in
the potential given schematically in the figure below.

Figure 8.37: Nuclear Potential Model

The potential barrier is due to the Coulomb potential $2(Z - 2)e^2/r$. The probability
of tunneling is proportional to the so-called Gamov's transmission coefficient
obtained in Problem 8.31

$$T = \exp\left[-\frac{2}{\hbar}\int_a^b \sqrt{2m(V(x) - E)}\;dx\right]$$

where a and b are the classical turning points (where E = V(x)). Work out
numerically T for the following parameters: Z = 92 (Uranium), size of nucleus
a = 5 fm and the kinetic energy of the α particle 1 MeV, 3 MeV, 10 MeV,
30 MeV.
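
One possible way to set up the numerical evaluation is sketched below: the integral in the exponent is done by a midpoint sum between the nuclear radius a and the outer turning point $b = 2(Z-2)e^2/E$. The constants ($\hbar c \approx 197.327$ MeV fm, $e^2 \approx 1.44$ MeV fm in Gaussian units, $m_\alpha c^2 \approx 3727.38$ MeV) and the use of the bare α mass rather than a reduced mass are assumptions of this sketch. The function name `gamow_T` is illustrative.

```python
import numpy as np

HBARC = 197.327      # hbar * c in MeV fm
E2 = 1.43996         # e^2 in MeV fm (Gaussian units)
M_ALPHA = 3727.38    # alpha-particle rest energy in MeV (reduced mass ignored)

def gamow_T(E, Z=92, a=5.0, n=400_000):
    """Transmission factor exp[-(2/hbar) * integral_a^b sqrt(2m(V - E)) dr]
    for V(r) = 2 (Z - 2) e^2 / r, with E in MeV and lengths in fm."""
    C = 2.0 * (Z - 2) * E2                 # Coulomb strength in MeV fm
    b = C / E                              # outer classical turning point
    dr = (b - a) / n
    r = a + (np.arange(n) + 0.5) * dr      # midpoints between a and b
    integrand = np.sqrt(2.0 * M_ALPHA * np.clip(C / r - E, 0.0, None)) / HBARC
    return np.exp(-2.0 * np.sum(integrand) * dr)

for E in (1.0, 3.0, 10.0, 30.0):
    print(E, gamow_T(E))
```

The printed values span many orders of magnitude, which is the exponential sensitivity the problem refers to. The result also depends noticeably on the assumed radius a.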


8.15.37. One Particle, Two Boxes 

Consider two boxes in 1-dimension of width a, with infinitely high walls, separated
by a distance L = 2a. We define the box by the potential energy function
sketched below.

[Figure axis labels: x = -L/2, x = 0, x = L/2]

Figure 8.38: Two Boxes 


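A particle experiences this potential and its state is described by a wave function.
The energy eigenfunctions are doubly degenerate, $\{\phi_n^{(+)}, \phi_n^{(-)} \;|\; n = 1,2,3,4,\ldots\}$, so
that

$$E_n^{(+)} = E_n^{(-)} = \frac{n^2\pi^2\hbar^2}{2ma^2}$$

where $\phi_n^{(\pm)}(x) = u_n(x \pm L/2)$ with

$$u_n(x) = \begin{cases} \sqrt{2/a}\,\cos(n\pi x/a) & n = 1,3,5,\ldots \quad -a/2 < x < a/2 \\ \sqrt{2/a}\,\sin(n\pi x/a) & n = 2,4,6,\ldots \quad -a/2 < x < a/2 \\ 0 & |x| > a/2 \end{cases}$$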

Suppose at time t = 0 the wave function is 

+ ^</’ 2 _) ( ;e ) + -4<^ +) (z) 


At this time, answer parts (a) - (d) 

(a) What is the probability of finding the particle in the state $\phi_1^{(+)}(x)$?

(b) What is the probability of finding the particle with energy $\pi^2\hbar^2/2ma^2$?

(c) CLAIM: At t = 0 there is a 50-50 chance for finding the particle in either 
box. Justify this claim. 

(d) What is the state at a later time assuming no measurements are done? 


Now let us generalize. Suppose we have an arbitrary wave function at 
t = 0, ip(x, 0), that satisfies all the boundary conditions. 

(e) Show that, in general, the probability to find the particle in the left box 
does not change with time. Explain why this makes sense physically. 


Switch gears again . 

(f) Show that the state $\Phi_n(x) = c_1\phi_n^{(+)}(x) + c_2\phi_n^{(-)}(x)$ (where $c_1$ and $c_2$ are
arbitrary complex numbers) is a stationary state.

(g) Sketch the probability density in x. What is the mean value $\langle x \rangle$? How
does this change with time?

(h) Show that the momentum space wave function is

$$\tilde{\psi}(p) = \sqrt{2}\,\cos(pL/2\hbar)\,\tilde{u}_1(p)$$

where $\tilde{u}_1(p)$ is the momentum-space wave function of $u_1(x)$.

(i) Without calculation, what is the mean value $\langle p \rangle$? How does this change
with time?

(j) Suppose the potential energy was somehow turned off (don’t ask me how, 
just imagine it was done) so the particle is now free. 

Without doing any calculation, sketch how you expect the position-space 
wave function to evolve at later times, showing all important features. 
Please explain your sketch. 


8.15.38. A half-infinite/half-leaky box 


Consider a one dimensional potential 






Figure 8.39: Infinite Wall + Delta Function 
(a) Show that the stationary states with energy E can be written

$$u(x) = \begin{cases} 0 & x < 0 \\ A\,\dfrac{\sin(ka + \phi(k))}{\sin(ka)}\,\sin(kx) & 0 < x < a \\ A\,\sin(kx + \phi(k)) & x > a \end{cases}$$

where

$$k = \sqrt{\frac{2mE}{\hbar^2}} , \qquad \phi(k) = \tan^{-1}\left[\frac{k\tan(ka)}{k - \gamma_0\tan(ka)}\right] , \qquad \gamma_0 = \frac{2mU_0}{\hbar^2}$$

What is the nature of these states - bound or unbound?

(b) Show that the limits $\gamma_0 \to 0$ and $\gamma_0 \to \infty$ give reasonable solutions.

(c) Sketch the energy eigenfunction when $ka = \pi$. Explain this solution.

(d) Sketch the energy eigenfunction when $ka = \pi/2$. How does the probability
to find the particle in the region 0 < x < a compare with that found in part
(c)? Comment.

(e) In a scattering scenario, we imagine sending in an incident plane wave
which is reflected with unit probability, but phase shifted according to the
conventions shown in the figure below:

Figure 8.40: Scattering Scenario

Show that the phase shift of the scattered wave is $\delta(k) = 2\phi(k)$.

There exist mathematical conditions such that the so-called S-matrix element
$e^{i\delta(k)}$ blows up. For these solutions is k real, imaginary, or complex?
Comment.

8.15.39. Neutrino Oscillations Redux 

Read the article T. Araki et al., "Measurement of Neutrino Oscillations with
KamLAND: Evidence of Spectral Distortion," Phys. Rev. Lett. 94, 081801 (2005),
which shows the neutrino oscillation, a quantum phenomenon demonstrated at
the largest distance scale yet (about 180 km).
(a) The Hamiltonian for an ultrarelativistic particle is approximated by

$$H = \sqrt{p^2c^2 + m^2c^4} \approx pc + \frac{m^2c^3}{2p}$$

for $p = |\vec{p}|$. Suppose in a basis of two states, $m^2$ is given as a 2 x 2 matrix

$$m^2 = m_0^2\,I + \frac{\Delta m^2}{2} \begin{pmatrix} -\cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & \cos(2\theta) \end{pmatrix}$$

Write down the eigenstates of $m^2$.

(b) Calculate the probability for the state 

W = (J) 

to be still found in the same state after time interval t for definite momen¬ 
tum p. 

(c) Using the data shown in Fig. 3 of the article, estimate approximately
values of $\Delta m^2$ and $\sin^2 2\theta$.


8.15.40. Is it in the ground state? 

An infinitely deep one-dimensional potential well runs from x = 0 to x = a. The
normalized energy eigenstates are

$$u_n(x) = \sqrt{\frac{2}{a}}\,\sin\left(\frac{n\pi x}{a}\right) , \qquad n = 1,2,3,\ldots$$

A particle is placed in the left-hand half of the well so that its wavefunction is
$\psi$ = constant for x < a/2. If the energy of the particle is now measured, what is
the probability of finding it in the ground state?

8.15.41. Some Thoughts on T-Violation 

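Any Hamiltonian can be recast to the form

$$H = U \begin{pmatrix} E_1 & 0 & \cdots & 0 \\ 0 & E_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & E_n \end{pmatrix} U^\dagger$$

where U is a general n x n unitary matrix.

(a) Show that the time evolution operator is given by

$$e^{-iHt/\hbar} = U \begin{pmatrix} e^{-iE_1 t/\hbar} & 0 & \cdots & 0 \\ 0 & e^{-iE_2 t/\hbar} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{-iE_n t/\hbar} \end{pmatrix} U^\dagger$$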
(b) For a two-state problem, the most general unitary matrix is

$$U = e^{i\theta} \begin{pmatrix} \cos\theta\,e^{i\phi} & -\sin\theta\,e^{i\eta} \\ \sin\theta\,e^{-i\eta} & \cos\theta\,e^{-i\phi} \end{pmatrix}$$

Work out the probabilities $P(1 \to 2)$ and $P(2 \to 1)$ over time interval t
and verify that they are the same despite the apparent T-violation
due to complex phases. NOTE: This is the same problem as the neutrino
oscillation (problem 8.39) if you set $E_i \approx pc + m_i^2c^3/2p$ and set
all phases to zero.

(c) For a three-state problem, however, the time-reversal invariance can be
broken. Calculate the difference $P(1 \to 2) - P(2 \to 1)$ for the following
form of the unitary matrix

$$U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_{23} & s_{23} \\ 0 & -s_{23} & c_{23} \end{pmatrix} \begin{pmatrix} c_{13} & 0 & s_{13}e^{-i\delta} \\ 0 & 1 & 0 \\ -s_{13}e^{i\delta} & 0 & c_{13} \end{pmatrix} \begin{pmatrix} c_{12} & s_{12} & 0 \\ -s_{12} & c_{12} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

where five unimportant phases have been dropped. The notation is $s_{12} =
\sin\theta_{12}$, $c_{23} = \cos\theta_{23}$, etc.

(d) For CP-conjugate states (e.g., anti-neutrinos ($\bar{\nu}$) vs neutrinos ($\nu$)), the
Hamiltonian is given by substituting $U^*$ in place of U. Show that the probabilities
$P(1 \to 2)$ and $P(\bar{1} \to \bar{2})$ can differ (CP violation) yet CPT is respected,
i.e., $P(1 \to 2) = P(\bar{2} \to \bar{1})$.

8.15.42. Kronig-Penney Model 

Consider a periodic repulsive potential of the form

$$V = \sum_{n=-\infty}^{\infty} \lambda\,\delta(x - na)$$

with $\lambda > 0$. The general solution for -a < x < 0 is given by

$$\psi(x) = Ae^{i\kappa x} + Be^{-i\kappa x}$$

with $\kappa = \sqrt{2mE}/\hbar$. Using Bloch's theorem, the wave function for the next
period 0 < x < a is given by

$$\psi(x) = e^{ika}\left(Ae^{i\kappa(x-a)} + Be^{-i\kappa(x-a)}\right)$$

for $|k| < \pi/a$. Answer the following questions.

(a) Write down the continuity condition for the wave function and the required
discontinuity for its derivative at x = 0. Show that the phase $e^{ika}$ under
the discrete translation $x \to x + a$ is given by $\kappa$ as



Here and below, $d = \hbar^2/m\lambda$.

(b) Take the limit of zero potential $d \to \infty$ and show that there are no gaps
between the bands as expected for a free particle.

(c) When the potential is weak but finite (large d) show analytically that
there appear gaps between the bands at $k = \pm\pi/a$.

(d) Plot the relationship between $\kappa$ and k for a weak potential (d = 3a) and a
strong potential (d = a/3) (both solutions together).

(e) You always find two values of k at the same energy (or $\kappa$). What discrete
symmetry guarantees this degeneracy?


8.15.43. Operator Moments and Uncertainty 

Consider an observable $O_A$ for a finite-dimensional quantum system with spectral
decomposition

$$O_A = \sum_i \lambda_i P_i$$

(a) Show that the exponential operator $E_A = \exp(O_A)$ has spectral decomposition

$$E_A = \sum_i e^{\lambda_i} P_i$$

Do this by inserting the spectral decomposition of $O_A$ into the power series
expansion of the exponential.

(b) Prove that for any state $|\Psi_A\rangle$ such that $\Delta O_A = 0$, we automatically have
$\Delta E_A = 0$.


8.15.44. Uncertainty and Dynamics 


Consider the observable 


and the initial state 



I'MO)) = (J) 


(a) Compute the uncertainty $\Delta O_x = 0$ with respect to the initial state $|\Psi_A(0)\rangle$.


(b) Now let the state evolve according to the Schrödinger equation, with
Hamiltonian operator



Compute the uncertainty $\Delta O_x = 0$ as a function of $t$.

(c) Repeat part (b) but replace $\hat{O}_x$ with the observable



That is, compute the uncertainty $\Delta O_z$ as a function of $t$ assuming
evolution according to the Schrödinger equation with the Hamiltonian above.

(d) Show that your answers to parts (b) and (c) always respect the Heisenberg 
Uncertainty Relation 


$$\Delta O_x\,\Delta O_z \ge \frac{1}{2}\left|\langle[\hat{O}_x, \hat{O}_z]\rangle\right|$$

Are there any times t at which the Heisenberg Uncertainty Relation is 
satisfied with equality?

Chapter 9

Angular Momentum; 2- and 3-Dimensions


9.1. Angular Momentum Eigenvalues and Eigenvectors


In Chapter 6, we derived the commutation relations that define angular
momentum operators

$$[\hat{J}_i, \hat{J}_j] = i\hbar\sum_k \varepsilon_{ijk}\hat{J}_k = i\hbar\,\varepsilon_{ijk}\hat{J}_k \qquad (9.1)$$

where

$$\varepsilon_{ijk} = \begin{cases} +1 & \text{if } ijk = \text{even permutation of } 123 \\ -1 & \text{if } ijk = \text{odd permutation of } 123 \\ 0 & \text{if any two indices are identical} \end{cases} \qquad (9.2)$$

and the Einstein summation convention over repeated indices is understood if
the summation sign is left out (unless an explicit override is given).


In addition, these are all Hermitian operators, i.e., $(\hat{J}_i)^\dagger = \hat{J}_i^\dagger = \hat{J}_i$.


Since these three operators form a closed commutator algebra, we can solve for 
the eigenvectors and eigenvalues using only the commutators. 

The three operators $\hat{J}_1$, $\hat{J}_2$ and $\hat{J}_3$ do not commute with each other and
hence do not share a common set of eigenvectors (called a representation).


However, there exists another operator that commutes with each of the angular 
momentum components separately. If we define 

$$\hat{J}^2 = \sum_{i=1}^{3} \hat{J}_i^2 \qquad (9.3)$$

which is the square of the total angular momentum vector, we have

$$[\hat{J}^2, \hat{J}_i] = 0 \quad \text{for } i = 1, 2, 3 \qquad (9.4)$$

In addition, we have $(\hat{J}^2)^\dagger = \hat{J}^2$, so that $\hat{J}^2$ is Hermitian also.

The commutation relations and the Hermitian property say that $\hat{J}^2$ and any
one of the components share a complete set of common eigenvectors. By
convention, we choose to use $\hat{J}^2$ and $\hat{J}_3$ as the two operators whose
eigenvalues (good quantum numbers) will characterize the set of common
eigenvectors.

As we shall see, $\hat{J}^2$ is a so-called Casimir invariant operator that
characterizes the representations (sets of eigenvectors). In particular, the
eigenvalue of $\hat{J}^2$ characterizes the representation and the eigenvalues of one
of the components of the angular momentum (usually $\hat{J}_3$) will characterize
the eigenvectors within a representation.


We define the eigenvector/eigenvalue relations by the equations 



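$$\hat{J}^2|\lambda m\rangle = \lambda\hbar^2|\lambda m\rangle \qquad (9.5)$$

$$\hat{J}_3|\lambda m\rangle = m\hbar|\lambda m\rangle \qquad (9.6)$$

where the appropriate factors of $\hbar$ that have been explicitly put into the
relations make $m$ and $\lambda$ dimensionless numbers.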


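We now define some other operators and their associated commutators so that
we can use them in our derivations.

$$\hat{J}_\pm = \hat{J}_1 \pm i\hat{J}_2 \qquad (9.7)$$

$$\hat{J}_- = (\hat{J}_+)^\dagger \;\to\; \text{they are not Hermitian operators} \qquad (9.8)$$

We then have

$$[\hat{J}^2, \hat{J}_\pm] = [\hat{J}^2, \hat{J}_1] \pm i[\hat{J}^2, \hat{J}_2] = 0 \qquad (9.9)$$

$$[\hat{J}_3, \hat{J}_\pm] = [\hat{J}_3, \hat{J}_1] \pm i[\hat{J}_3, \hat{J}_2] = i\hbar\hat{J}_2 \mp i(i\hbar\hat{J}_1) = \pm\hbar\hat{J}_\pm \qquad (9.10)$$

and

$$[\hat{J}_+, \hat{J}_-] = [\hat{J}_1, \hat{J}_1] + i[\hat{J}_2, \hat{J}_1] - i[\hat{J}_1, \hat{J}_2] + [\hat{J}_2, \hat{J}_2] = -2i[\hat{J}_1, \hat{J}_2] = 2\hbar\hat{J}_3 \qquad (9.11)$$

and

$$\hat{J}_+\hat{J}_- = (\hat{J}_1 + i\hat{J}_2)(\hat{J}_1 - i\hat{J}_2) = \hat{J}_1^2 + \hat{J}_2^2 - i[\hat{J}_1, \hat{J}_2] = \hat{J}^2 - \hat{J}_3^2 + \hbar\hat{J}_3 \qquad (9.12)$$

and

$$\hat{J}_-\hat{J}_+ = \hat{J}^2 - \hat{J}_3^2 - \hbar\hat{J}_3 \qquad (9.13)$$

Finally, we have

$$\hat{J}^2 = \frac{\hat{J}_+\hat{J}_- + \hat{J}_-\hat{J}_+}{2} + \hat{J}_3^2 \qquad (9.14)$$

9.1.1. Derivation of Eigenvalues

Now the definitions (9.5) and (9.6) tell us that

$$\langle\lambda m|\hat{J}^2|\lambda m\rangle = \lambda\hbar^2\langle\lambda m|\lambda m\rangle = \sum_i \langle\lambda m|\hat{J}_i^2|\lambda m\rangle \qquad (9.15)$$

$$\lambda\hbar^2\langle\lambda m|\lambda m\rangle = \sum_i \langle\lambda m|\hat{J}_i\hat{J}_i|\lambda m\rangle = \sum_i \langle\lambda m|\hat{J}_i^\dagger\hat{J}_i|\lambda m\rangle \qquad (9.16)$$

Let us define the new vector $|\alpha_i\rangle = \hat{J}_i|\lambda m\rangle$. Remember that the norm of
any vector is non-negative, i.e., $\langle a|a\rangle \ge 0$. Therefore

$$\langle\lambda m|\lambda m\rangle \ge 0 \quad \text{and} \quad \langle\alpha_i|\alpha_i\rangle \ge 0 \qquad (9.17)$$

Now since $\langle\alpha_i| = \langle\lambda m|\hat{J}_i^\dagger$ we have

$$\lambda\hbar^2\langle\lambda m|\lambda m\rangle = \sum_i \langle\lambda m|\hat{J}_i^\dagger\hat{J}_i|\lambda m\rangle = \sum_i \langle\alpha_i|\alpha_i\rangle \ge 0 \qquad (9.18)$$

or we have

$$\lambda \ge 0 \quad \text{or the eigenvalues of } \hat{J}^2 \text{ are greater than or equal to } 0 \qquad (9.19)$$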

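In fact, we can even say more than this using these equations. We have

$$\begin{aligned}
\lambda\hbar^2\langle\lambda m|\lambda m\rangle &= \sum_i \langle\alpha_i|\alpha_i\rangle \\
&= \langle\alpha_1|\alpha_1\rangle + \langle\alpha_2|\alpha_2\rangle + \langle\alpha_3|\alpha_3\rangle \\
&= \langle\alpha_1|\alpha_1\rangle + \langle\alpha_2|\alpha_2\rangle + \langle\lambda m|\hat{J}_3^2|\lambda m\rangle \\
&= \langle\alpha_1|\alpha_1\rangle + \langle\alpha_2|\alpha_2\rangle + m^2\hbar^2\langle\lambda m|\lambda m\rangle \ge 0
\end{aligned}$$

which says that

$$\lambda\hbar^2\langle\lambda m|\lambda m\rangle \ge m^2\hbar^2\langle\lambda m|\lambda m\rangle \;\Rightarrow\; \lambda \ge m^2 \qquad (9.20)$$

This says that for a fixed value of $\lambda$ (the eigenvalue of $\hat{J}^2$), which
characterizes the representation, there must be maximum and minimum values of
$m$ (the eigenvalue of $\hat{J}_3$), which characterizes the eigenvectors within a
representation.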

Now we have

$$\begin{aligned}
\hat{J}_3\hat{J}_+|\lambda m\rangle &= \hat{J}_3(\hat{J}_+|\lambda m\rangle) \\
&= (\hat{J}_+\hat{J}_3 + [\hat{J}_3, \hat{J}_+])|\lambda m\rangle = (\hat{J}_+\hat{J}_3 + \hbar\hat{J}_+)|\lambda m\rangle \\
&= \hbar(m+1)\hat{J}_+|\lambda m\rangle = \hbar(m+1)(\hat{J}_+|\lambda m\rangle)
\end{aligned} \qquad (9.21)$$

which says that $\hat{J}_+|\lambda m\rangle$ is an eigenvector of $\hat{J}_3$ with the raised eigenvalue
$\hbar(m+1)$, i.e., $\hat{J}_+|\lambda m\rangle \propto |\lambda, m+1\rangle$ (remember the harmonic oscillator
discussion).

Since we already showed that for fixed $\lambda$ there must be a maximum value of
$m$, say $m_{max}$, it must be the case that for that particular $m$-value we have

$$\hat{J}_+|\lambda m_{max}\rangle = 0 \qquad (9.22)$$

If this were not true, then we would have

$$\hat{J}_+|\lambda m_{max}\rangle \propto |\lambda, m_{max}+1\rangle \qquad (9.23)$$

but this violates the statement that $m_{max}$ was the maximum $m$-value.


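Using this result we find

$$\begin{aligned}
\hat{J}_-\hat{J}_+|\lambda m_{max}\rangle &= 0 = (\hat{J}^2 - \hat{J}_3^2 - \hbar\hat{J}_3)|\lambda m_{max}\rangle \\
\hbar^2(\lambda - m_{max}^2 - m_{max})|\lambda m_{max}\rangle &= 0 \\
\lambda - m_{max}^2 - m_{max} = 0 \;&\to\; \lambda = m_{max}^2 + m_{max} \\
\lambda &= m_{max}(m_{max}+1)
\end{aligned} \qquad (9.24)$$

It is convention to define

$$m_{max} = j \quad \text{and hence} \quad \lambda = j(j+1) \qquad (9.25)$$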

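In the same way we can show

$$\begin{aligned}
\hat{J}_3\hat{J}_-|\lambda m\rangle &= \hat{J}_3(\hat{J}_-|\lambda m\rangle) \\
&= (\hat{J}_-\hat{J}_3 + [\hat{J}_3, \hat{J}_-])|\lambda m\rangle = (\hat{J}_-\hat{J}_3 - \hbar\hat{J}_-)|\lambda m\rangle \\
&= \hbar(m-1)\hat{J}_-|\lambda m\rangle = \hbar(m-1)(\hat{J}_-|\lambda m\rangle)
\end{aligned} \qquad (9.26)$$

which says that $\hat{J}_-|\lambda m\rangle$ is an eigenvector of $\hat{J}_3$ with the lowered eigenvalue
$\hbar(m-1)$, i.e., $\hat{J}_-|\lambda m\rangle \propto |\lambda, m-1\rangle$.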


If we let the minimum value of $m$ be $m_{min}$, then as before we must have

$$\hat{J}_-|\lambda m_{min}\rangle = 0 \qquad (9.27)$$

or $m_{min}$ is not the minimum value of $m$. This says that

$$\begin{aligned}
\hat{J}_+\hat{J}_-|\lambda m_{min}\rangle &= 0 = (\hat{J}^2 - \hat{J}_3^2 + \hbar\hat{J}_3)|\lambda m_{min}\rangle \\
\hbar^2(\lambda - m_{min}^2 + m_{min})|\lambda m_{min}\rangle &= 0 \\
\lambda - m_{min}^2 + m_{min} = 0 \;&\to\; \lambda = m_{min}^2 - m_{min} \\
\lambda = m_{min}(m_{min}-1) &= j(j+1)
\end{aligned} \qquad (9.28)$$

which says that

$$m_{min} = -j \qquad (9.29)$$


We have thus shown that the pair of operators $\hat{J}^2$ and $\hat{J}_3$ have a common set
of eigenvectors $|jm\rangle$ (we now use the labels $j$ and $m$), where we have found
that

$$-j \le m \le j \qquad (9.30)$$

and the allowed $m$-values change by steps of one, i.e., for a given $j$-value,
the allowed $m$-values are

$$-j,\; -j+1,\; -j+2,\; \ldots,\; j-2,\; j-1,\; j \qquad (9.31)$$

which implies (since going from $-j$ to $j$ in unit steps takes $2j$ steps) that

$$2j = \text{integer} \qquad (9.32)$$

or

$$j = \frac{\text{integer}}{2} \ge 0 \quad \text{the allowed values} \qquad (9.33)$$


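Thus, we have the allowed sets or representations of the angular momentum
commutation relations given by

$$\begin{aligned}
j &= 0 \;, & m &= 0 \\
j &= \tfrac{1}{2} \;, & m &= \tfrac{1}{2},\, -\tfrac{1}{2} \\
j &= 1 \;, & m &= 1,\, 0,\, -1 \\
j &= \tfrac{3}{2} \;, & m &= \tfrac{3}{2},\, \tfrac{1}{2},\, -\tfrac{1}{2},\, -\tfrac{3}{2}
\end{aligned} \qquad (9.34)$$

and so on.

For each value of $j$, there are $2j+1$ allowed $m$-values in the eigenvalue
spectrum (representation) and

$$\hat{J}^2|jm\rangle = \hbar^2 j(j+1)|jm\rangle \qquad (9.35)$$

$$\hat{J}_3|jm\rangle = m\hbar|jm\rangle \qquad (9.36)$$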

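Before proceeding, we need a few more relations. We found earlier that

$$\hat{J}_+|jm\rangle = C_+|j, m+1\rangle = |\alpha_+\rangle \qquad (9.37)$$

$$\hat{J}_-|jm\rangle = C_-|j, m-1\rangle = |\alpha_-\rangle \qquad (9.38)$$

and from these we have

$$\langle\alpha_+| = C_+^*\langle j, m+1| \qquad (9.39)$$

$$\langle\alpha_-| = C_-^*\langle j, m-1| \qquad (9.40)$$

We can then say that

$$\begin{aligned}
\langle\alpha_+|\alpha_+\rangle &= |C_+|^2\langle j, m+1|j, m+1\rangle = |C_+|^2 \\
&= (\langle jm|(\hat{J}_+)^\dagger)(\hat{J}_+|jm\rangle) \\
&= \langle jm|\hat{J}_-\hat{J}_+|jm\rangle \\
&= \langle jm|(\hat{J}^2 - \hat{J}_3^2 - \hbar\hat{J}_3)|jm\rangle \\
&= \langle jm|\hbar^2(j(j+1) - m^2 - m)|jm\rangle \\
&= \hbar^2(j(j+1) - m^2 - m)\langle jm|jm\rangle \\
&= \hbar^2(j(j+1) - m^2 - m)
\end{aligned} \qquad (9.41)$$

or, choosing the phase so that $C_+$ is real and positive,

$$C_+ = \hbar\sqrt{j(j+1) - m(m+1)} \qquad (9.42)$$

and similarly

$$\begin{aligned}
\langle\alpha_-|\alpha_-\rangle &= |C_-|^2\langle j, m-1|j, m-1\rangle = |C_-|^2 \\
&= (\langle jm|(\hat{J}_-)^\dagger)(\hat{J}_-|jm\rangle) \\
&= \langle jm|\hat{J}_+\hat{J}_-|jm\rangle \\
&= \langle jm|(\hat{J}^2 - \hat{J}_3^2 + \hbar\hat{J}_3)|jm\rangle \\
&= \langle jm|\hbar^2(j(j+1) - m^2 + m)|jm\rangle \\
&= \hbar^2(j(j+1) - m^2 + m)\langle jm|jm\rangle \\
&= \hbar^2(j(j+1) - m^2 + m)
\end{aligned} \qquad (9.43)$$

or

$$C_- = \hbar\sqrt{j(j+1) - m(m-1)} \qquad (9.44)$$

Therefore, we have the very important relations for the raising/lowering or
ladder operators

$$\hat{J}_\pm|jm\rangle = \hbar\sqrt{j(j+1) - m(m\pm 1)}\,|j, m\pm 1\rangle = \hbar\sqrt{(j \pm m + 1)(j \mp m)}\,|j, m\pm 1\rangle \qquad (9.45)$$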
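The ladder relations (9.45) are enough to write down explicit matrices for any $j$. The following is a minimal sketch, not part of the original derivation: it builds $\hat{J}_3$ and $\hat{J}_\pm$ in the basis $m = j, j-1, \ldots, -j$ (with $\hbar = 1$) and checks (9.11) and (9.14) numerically. The function names `ladder_matrices` and `check` and the argument `two_j` (the integer $2j$) are illustrative choices.

```python
import math

def ladder_matrices(two_j, hbar=1.0):
    """Return (J3, Jplus, Jminus) in the basis m = j, j-1, ..., -j.

    two_j is the integer 2j, so half-integer j is handled without rounding.
    """
    dim = two_j + 1
    j = two_j / 2.0
    ms = [j - k for k in range(dim)]
    J3 = [[hbar * ms[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    Jp = [[0.0] * dim for _ in range(dim)]
    for c in range(1, dim):
        # J+ |j,m> = hbar sqrt(j(j+1) - m(m+1)) |j,m+1>: column c feeds row c-1
        m = ms[c]
        Jp[c - 1][c] = hbar * math.sqrt(j * (j + 1) - m * (m + 1))
    # J- is the Hermitian conjugate of J+ (entries are real, so just transpose)
    Jm = [[Jp[c][r] for c in range(dim)] for r in range(dim)]
    return J3, Jp, Jm

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def add(A, B, s=1.0):
    """Return A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_abs(A):
    return max(abs(x) for row in A for x in row)

def check(two_j):
    """Return the largest violations of (9.11) and of J^2 = j(j+1) I via (9.14)."""
    J3, Jp, Jm = ladder_matrices(two_j)
    j = two_j / 2.0
    dim = two_j + 1
    commutator = add(matmul(Jp, Jm), matmul(Jm, Jp), -1.0)
    err_comm = max_abs(add(commutator, [[2.0 * x for x in row] for row in J3], -1.0))
    half_sum = [[0.5 * x for x in row] for row in add(matmul(Jp, Jm), matmul(Jm, Jp))]
    J2 = add(half_sum, matmul(J3, J3))
    target = [[j * (j + 1) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    err_J2 = max_abs(add(J2, target, -1.0))
    return err_comm, err_J2
```

Calling `check(3)` tests the $j = 3/2$ representation; both returned errors should be at the level of floating-point rounding.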

9.2. Transformations and Generators; Spherical Harmonics

There are many vector operators and vector component operators in the
following discussions. To avoid confusing notation, we will adopt the following
conventions:

$\vec{A}_{op}$ = a vector operator
$\hat{A}_j$ = a vector component operator
$\hat{B}$ = any non-vector operator or it might be a unit vector (context will decide)
$\vec{q}$ = an ordinary vector

As we showed earlier, the angular momentum operators are the generators of
rotations. The unitary transformation operator for a rotation through an angle
$\theta$ about an axis along the direction specified by the unit vector $\hat{n}$ is
given by

$$\hat{U}_{\hat{n}}(\theta) = e^{-\frac{i}{\hbar}\theta\,\hat{n}\cdot\vec{J}_{op}} \qquad (9.46)$$

where $\vec{J}_{op}$ is the angular momentum operator.

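What does the $\hat{J}_3$ operator look like in the position representation?
We will quickly get an idea of the answer and then step back and do things in
more detail.

Suppose that we have a position representation wave function $\psi(\vec{r})$. For a
rotation about the 3-axis, we showed earlier in Chapter 6 that for an
infinitesimal angle $\varepsilon$

$$\hat{U}_3(\varepsilon)\psi(x_1, x_2, x_3) = \psi(x_1\cos\varepsilon + x_2\sin\varepsilon,\; -x_1\sin\varepsilon + x_2\cos\varepsilon,\; x_3) \qquad (9.47)$$

In general, for infinitesimal shifts ($a$ in this case) in the coordinates, we
have, to lowest order in the infinitesimals, for a function of one variable

$$f(x + a) = f(x) + a\frac{\partial f}{\partial x} \qquad (9.48)$$

Extending this to two variables, we have, to first order in infinitesimals $a$
and $b$,

$$f(x + a, y + b) = f(x, y) + a\frac{\partial f}{\partial x} + b\frac{\partial f}{\partial y} \qquad (9.49)$$

Therefore, for an infinitesimal angle of rotation $\varepsilon$,

$$\begin{aligned}
\hat{U}_3(\varepsilon)\psi(x_1, x_2, x_3) &= \psi(x_1\cos\varepsilon + x_2\sin\varepsilon,\; -x_1\sin\varepsilon + x_2\cos\varepsilon,\; x_3) \\
&= \psi(x_1, x_2, x_3) + \varepsilon x_2\frac{\partial\psi}{\partial x_1} - \varepsilon x_1\frac{\partial\psi}{\partial x_2} \\
&= \left(1 + \varepsilon\left(x_2\frac{\partial}{\partial x_1} - x_1\frac{\partial}{\partial x_2}\right)\right)\psi(x_1, x_2, x_3)
\end{aligned} \qquad (9.50)$$

But we also have (to first order)

$$\hat{U}_3(\varepsilon)\psi(x_1, x_2, x_3) = \left(1 - \frac{i}{\hbar}\varepsilon\hat{J}_3\right)\psi(x_1, x_2, x_3) \qquad (9.51)$$

Putting these two equations together we have

$$\hat{J}_3 = -i\hbar\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right) = (\vec{r}_{op} \times (-i\hbar\nabla))_3 = (\vec{r}_{op} \times \vec{p}_{op})_3 = \hat{L}_3 \qquad (9.52)$$

where

$\vec{L}_{op}$ = orbital angular momentum operator
$\vec{r}_{op} = (\hat{x}, \hat{y}, \hat{z}) = (\hat{x}_1, \hat{x}_2, \hat{x}_3)$
$\vec{p}_{op} = (\hat{p}_x, \hat{p}_y, \hat{p}_z) = (\hat{p}_1, \hat{p}_2, \hat{p}_3)$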
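The identification of the rotation generator with $\hat{L}_3$ can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: the test function `psi`, the sample point and the step sizes are arbitrary choices. It compares the rate of change of the rotated wave function in (9.47) at $\varepsilon = 0$ with $(x_2\,\partial_1 - x_1\,\partial_2)\psi$, which is $(-i/\hbar)\hat{L}_3\psi$.

```python
import math

def psi(x1, x2, x3):
    # Arbitrary smooth test function (an assumption; any differentiable psi works).
    return x1 * math.exp(-(x1 ** 2 + 2.0 * x2 ** 2 + x3 ** 2))

def rotated_psi(eps, x1, x2, x3):
    # Right-hand side of (9.47): psi evaluated at the back-rotated point.
    c, s = math.cos(eps), math.sin(eps)
    return psi(x1 * c + x2 * s, -x1 * s + x2 * c, x3)

def partial(f, point, axis, h=1e-5):
    # Central finite difference for the partial derivative along one axis.
    up, dn = list(point), list(point)
    up[axis] += h
    dn[axis] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def generator_on_psi(point):
    # (x2 d/dx1 - x1 d/dx2) psi, which equals (-i/hbar) L3 psi.
    x1, x2, _ = point
    return x2 * partial(psi, point, 0) - x1 * partial(psi, point, 1)

def rotation_rate(point, eps=1e-4):
    # d/d(eps) of U3(eps) psi at eps = 0, estimated by a symmetric difference.
    return (rotated_psi(eps, *point) - rotated_psi(-eps, *point)) / (2.0 * eps)

point = (0.3, -0.7, 0.2)
lhs = rotation_rate(point)
rhs = generator_on_psi(point)
```

The two numbers `lhs` and `rhs` should agree to within the finite-difference error, which is the content of equating (9.50) with (9.51).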


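Since $\vec{L}_{op}$ is an angular momentum, it must have the commutation relations

$$[\hat{L}_i, \hat{L}_j] = i\hbar\,\varepsilon_{ijk}\hat{L}_k \qquad (9.53)$$

where, as we indicated above,

$$\hat{L}_1 = \hat{x}_2\hat{p}_3 - \hat{x}_3\hat{p}_2 \qquad (9.54)$$

$$\hat{L}_2 = \hat{x}_3\hat{p}_1 - \hat{x}_1\hat{p}_3 \qquad (9.55)$$

$$\hat{L}_3 = \hat{x}_1\hat{p}_2 - \hat{x}_2\hat{p}_1 \qquad (9.56)$$

and

$$\vec{L}_{op}^{\,\dagger} = \vec{L}_{op} \qquad (9.57)$$

Using

$$[\hat{x}_i, \hat{p}_j] = i\hbar\,\delta_{ij} \qquad (9.58)$$

we get

$$[\hat{L}_i, \hat{x}_j] = i\hbar\,\varepsilon_{ijk}\hat{x}_k \qquad (9.59)$$

and for $\hat{n}$ = a unit vector

$$[\hat{n}\cdot\vec{L}_{op}, \hat{x}_j] = \sum_i n_i[\hat{L}_i, \hat{x}_j] = i\hbar\sum_{i,k} n_i\,\varepsilon_{ijk}\hat{x}_k \qquad (9.60)$$

$$[\hat{n}\cdot\vec{L}_{op}, \hat{x}_j] = i\hbar\,(\vec{r}_{op} \times \hat{n})_j \qquad (9.61)$$

$$[\hat{n}\cdot\vec{L}_{op}, \vec{r}_{op}] = \sum_j [\hat{n}\cdot\vec{L}_{op}, \hat{x}_j]\,\hat{e}_j = i\hbar\sum_j (\vec{r}_{op} \times \hat{n})_j\,\hat{e}_j = i\hbar\,(\vec{r}_{op} \times \hat{n}) \qquad (9.62)$$

where we have used

$$\vec{A} \times \vec{B} = \sum_{ijk} \varepsilon_{ijk} A_j B_k\,\hat{e}_i \qquad (9.63)$$

Similarly, we get

$$[\hat{n}\cdot\vec{L}_{op}, \vec{p}_{op}] = i\hbar\,(\vec{p}_{op} \times \hat{n}) \qquad (9.64)$$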

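Now let us step back and consider rotations in 3-dimensional space and try to
get a better physical understanding of what is happening.

Consider the operator

$$\vec{r}\,'_{op} = \vec{r}_{op} + \vec{\alpha} \times \vec{r}_{op} \qquad (9.65)$$

where $\vec{\alpha}$ = ordinary infinitesimal vector.

Now let $|\vec{r}_0\rangle$ = an eigenstate of $\vec{r}_{op}$ with eigenvalue $\vec{r}_0$, i.e.,

$$\vec{r}_{op}|\vec{r}_0\rangle = \vec{r}_0|\vec{r}_0\rangle \qquad (9.66)$$

We then have

$$\vec{r}\,'_{op}|\vec{r}_0\rangle = \vec{r}\,'_0|\vec{r}_0\rangle = (\vec{r}_0 + \vec{\alpha} \times \vec{r}_0)|\vec{r}_0\rangle \qquad (9.67)$$

which says that $|\vec{r}_0\rangle$ is also an eigenstate of $\vec{r}\,'_{op}$ (with a different
eigenvalue).

Now, for simplicity, let $\vec{\alpha} = \alpha\hat{e}_3$, which says that

$$\vec{\alpha} \times \vec{r}_0 = \alpha\hat{e}_3 \times (x_{01}\hat{e}_1 + x_{02}\hat{e}_2 + x_{03}\hat{e}_3) = \alpha x_{01}\hat{e}_2 - \alpha x_{02}\hat{e}_1 \qquad (9.68)$$

$$\vec{r}_0 + \vec{\alpha} \times \vec{r}_0 = (x_{01} - \alpha x_{02})\hat{e}_1 + (x_{02} + \alpha x_{01})\hat{e}_2 + x_{03}\hat{e}_3 \qquad (9.69)$$

This last expression is the vector we get if we rotate $\vec{r}_0$ about
$\hat{\alpha} = \vec{\alpha}/|\vec{\alpha}|$ ($\hat{e}_3$ in this case) by an infinitesimal angle
$\alpha = |\vec{\alpha}|$. This result generalizes for $\vec{\alpha}$ in any direction.

Since the eigenvalues of $\vec{r}\,'_{op}$ are those of $\vec{r}_{op}$ rotated by $|\vec{\alpha}|$ about
$\hat{\alpha}$, we conclude that $\vec{r}\,'_{op}$ is the position operator rotated by $|\vec{\alpha}|$
about $\hat{\alpha}$.

Alternatively, $\vec{r}\,'_{op}$ is the position operator in a coordinate frame rotated
by $|\vec{\alpha}|$ about $\hat{\alpha}$ (remember our earlier discussions about the
active/passive views).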
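To see how good the first-order expression (9.69) is, one can compare it with an exact finite rotation. The following sketch is an illustration only (the function names and the sample vector are assumptions): it rotates an ordinary vector about $\hat{e}_3$ both ways and returns the largest component difference, which should shrink like $\alpha^2$.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def first_order_rotation(r, alpha_vec):
    # r' = r + alpha x r, i.e. (9.65) applied to an ordinary vector
    c = cross(alpha_vec, r)
    return tuple(ri + ci for ri, ci in zip(r, c))

def exact_rotation_about_z(r, angle):
    # Exact active rotation of r about e3 by a finite angle
    x, y, z = r
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle),
            z)

def max_difference(r, angle):
    approx = first_order_rotation(r, (0.0, 0.0, angle))
    exact = exact_rotation_about_z(r, angle)
    return max(abs(p - q) for p, q in zip(approx, exact))

r0 = (1.0, 2.0, 3.0)
d_small = max_difference(r0, 1e-3)
d_double = max_difference(r0, 2e-3)
```

Doubling the angle should roughly quadruple the discrepancy, confirming that the neglected terms are second order in $\alpha$.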


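To connect this to generators, unitary transformations, and angular momentum,
we proceed as follows. We can rewrite $\vec{r}\,'_{op}$ as

$$\vec{r}\,'_{op} = \vec{r}_{op} + \frac{i}{\hbar}[\vec{\alpha}\cdot\vec{L}_{op}, \vec{r}_{op}] \qquad (9.70)$$

which is equivalent (to first order in $\alpha$) to

$$\vec{r}\,'_{op} = e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}\,\vec{r}_{op}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}} \qquad (9.71)$$

i.e.,

$$\begin{aligned}
\vec{r}\,'_{op} &= e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}\,\vec{r}_{op}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}} \\
&= \left(1 + \frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}\right)\vec{r}_{op}\left(1 - \frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}\right) + O(\alpha^2) \\
&= \vec{r}_{op} + \frac{i}{\hbar}[\vec{\alpha}\cdot\vec{L}_{op}, \vec{r}_{op}] + O(\alpha^2)
\end{aligned} \qquad (9.72)$$

Earlier, however, we showed that, if the state vector transforms as

$$|\psi'\rangle = \hat{U}|\psi\rangle \qquad (9.73)$$

then the operators transform as

$$\hat{U}^{-1}\hat{O}\hat{U} \qquad (9.74)$$

which then implies that the rotation operator is

$$\hat{U}(\vec{\alpha}) = e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}} \qquad (9.75)$$

as we expect.

This result, derived for infinitesimal rotation angles, holds for finite
rotation angles also.

Let us look at the effect on state vectors

$$\vec{r}\,'_{op}|\vec{r}_0\rangle = \vec{r}\,'_0|\vec{r}_0\rangle = e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}\,\vec{r}_{op}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}|\vec{r}_0\rangle \qquad (9.76)$$

$$\vec{r}_{op}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}|\vec{r}_0\rangle = \vec{r}\,'_0\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}|\vec{r}_0\rangle \qquad (9.77)$$

or

$$e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}|\vec{r}_0\rangle \qquad (9.78)$$

is an eigenstate of $\vec{r}_{op}$ with eigenvalue $\vec{r}\,'_0$ or

$$|\vec{r}\,'_0\rangle = e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}|\vec{r}_0\rangle \quad \text{and} \quad \langle\vec{r}\,'_0| = \langle\vec{r}_0|\,e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}} \qquad (9.79)$$

and thus $\vec{L}_{op}$ is the generator of ordinary rotations in 3-dimensional space.

For wave functions we have $\psi(\vec{r}_0) = \langle\vec{r}_0|\psi\rangle$. Using this result, the wave
function transforms as

$$\begin{aligned}
\psi(\vec{r}\,'_0) &= \langle\vec{r}\,'_0|\psi\rangle = \langle\vec{r}_0|\,e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}|\psi\rangle \\
&= \langle\vec{r}_0|\psi'\rangle = \psi'(\vec{r}_0) = \text{wave function at rotated point } \vec{r}\,'_0
\end{aligned} \qquad (9.80)$$

Now, we have seen that

$$\vec{L}_{op} = \vec{r}_{op} \times \vec{p}_{op} = -i\hbar\,\vec{r}_{op} \times \nabla \qquad (9.81)$$

This can easily be evaluated in different coordinate systems.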


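Cartesian Coordinates

$$\vec{L}_{op} = -i\hbar\sum_{ijk} \varepsilon_{ijk}\,x_j\frac{\partial}{\partial x_k}\,\hat{e}_i \qquad (9.82)$$

$$\hat{L}_1 = -i\hbar\left(x_2\frac{\partial}{\partial x_3} - x_3\frac{\partial}{\partial x_2}\right) \;,\quad \hat{L}_2 = -i\hbar\left(x_3\frac{\partial}{\partial x_1} - x_1\frac{\partial}{\partial x_3}\right) \qquad (9.83)$$

$$\hat{L}_3 = -i\hbar\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right) \qquad (9.84)$$

as we saw at the beginning of this discussion.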

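Spherical-Polar Coordinates

We have

$$\vec{r} = r\hat{e}_r \quad \text{and} \quad \nabla = \hat{e}_r\frac{\partial}{\partial r} + \hat{e}_\theta\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{e}_\varphi\frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi} \qquad (9.85)$$

where

$$\hat{e}_r = \sin\theta\cos\varphi\,\hat{e}_1 + \sin\theta\sin\varphi\,\hat{e}_2 + \cos\theta\,\hat{e}_3 \qquad (9.86)$$

$$\hat{e}_\theta = \cos\theta\cos\varphi\,\hat{e}_1 + \cos\theta\sin\varphi\,\hat{e}_2 - \sin\theta\,\hat{e}_3 \qquad (9.87)$$

$$\hat{e}_\varphi = -\sin\varphi\,\hat{e}_1 + \cos\varphi\,\hat{e}_2 \qquad (9.88)$$

which gives

$$\vec{L}_{op} = \frac{\hbar}{i}\left[\hat{e}_\varphi\frac{\partial}{\partial\theta} - \hat{e}_\theta\frac{1}{\sin\theta}\frac{\partial}{\partial\varphi}\right] \qquad (9.89)$$

and

$$\hat{L}_3 = \hat{L}_z = \hat{e}_z\cdot\vec{L}_{op} = \frac{\hbar}{i}\frac{\partial}{\partial\varphi} \qquad (9.90)$$

$$\hat{L}^2_{op} = \vec{L}_{op}\cdot\vec{L}_{op} = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \qquad (9.91)$$

Similarly, we have

$$\hat{L}_1 = \hat{L}_x = i\hbar\left[\sin\varphi\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\frac{\partial}{\partial\varphi}\right] \qquad (9.92)$$

$$\hat{L}_2 = \hat{L}_y = -i\hbar\left[\cos\varphi\frac{\partial}{\partial\theta} - \cot\theta\sin\varphi\frac{\partial}{\partial\varphi}\right] \qquad (9.93)$$

and

$$\nabla^2 = \frac{1}{r}\frac{\partial^2}{\partial r^2}\,r - \frac{\hat{L}^2_{op}}{\hbar^2 r^2} \qquad (9.94)$$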

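9.2.1. Eigenfunctions; Eigenvalues; Position Representation

Since

$$[\hat{L}_i, \hat{L}_j] = i\hbar\,\varepsilon_{ijk}\hat{L}_k \quad \text{and} \quad [\hat{L}^2_{op}, \hat{L}_j] = 0 \qquad (9.95)$$

the derivation of the eigenvalues and eigenvectors follows our earlier work,
i.e., the equations

$$\hat{L}^2_{op}|\ell m\rangle = \hbar^2\ell(\ell+1)|\ell m\rangle \quad \text{and} \quad \hat{L}_3|\ell m\rangle = \hbar m|\ell m\rangle \qquad (9.96)$$

$$\hat{L}_\pm = \hat{L}_x \pm i\hat{L}_y \qquad (9.97)$$

imply

$$\ell = \frac{\text{integer}}{2} \ge 0 \qquad (9.98)$$

and for a given value of $\ell$, $m$ takes on the $2\ell+1$ values

$$m = -\ell,\; -\ell+1,\; -\ell+2,\; \ldots,\; \ell-2,\; \ell-1,\; \ell \qquad (9.99)$$

If we move back into 3-dimensional space and define

$$Y_{\ell m}(\theta, \varphi) = \langle\theta\varphi|\ell m\rangle = \text{spherical harmonic} \qquad (9.100)$$

then we have the defining equations for the $Y_{\ell m}(\theta, \varphi)$ given by

$$\begin{aligned}
\langle\theta\varphi|\hat{L}^2_{op}|\ell m\rangle &= \hat{L}^2_{op}\langle\theta\varphi|\ell m\rangle = \hat{L}^2_{op}Y_{\ell m}(\theta, \varphi) \\
&= \hbar^2\ell(\ell+1)\langle\theta\varphi|\ell m\rangle = \hbar^2\ell(\ell+1)Y_{\ell m}(\theta, \varphi)
\end{aligned} \qquad (9.101)$$

$$\begin{aligned}
\langle\theta\varphi|\hat{L}_3|\ell m\rangle &= \hat{L}_3\langle\theta\varphi|\ell m\rangle = \hat{L}_3 Y_{\ell m}(\theta, \varphi) \\
&= \hbar m\langle\theta\varphi|\ell m\rangle = \hbar m Y_{\ell m}(\theta, \varphi)
\end{aligned} \qquad (9.102)$$

Before determining the functional form of the $Y_{\ell m}$'s, we must step back and
see if there are any restrictions that need to be imposed on the possible
eigenvalues $\ell$ and $m$ due to the fact that we are in a real 3-dimensional
space.

In general, eigenvalue restrictions come about only from the imposition of
physical boundary conditions. A standard boundary condition that is usually
imposed is the following:

In real 3-dimensional space, if we rotate the system (or the axes) by $2\pi$,
then we get back the same world. This means that the $Y_{\ell m}(\theta, \varphi)$ should be
single-valued under such rotations or that

$$Y_{\ell m}(\theta, \varphi + 2\pi) = Y_{\ell m}(\theta, \varphi) \qquad (9.103)$$

Now from the general rules we have developed, this gives

$$\langle\theta, \varphi + 2\pi|\ell m\rangle = \langle\theta\varphi|\,e^{2\pi i\hat{L}_3/\hbar}|\ell m\rangle = e^{2\pi i m}\langle\theta, \varphi|\ell m\rangle \qquad (9.104)$$

or that single-valuedness of the wave function requires that

$$\langle\theta, \varphi + 2\pi|\ell m\rangle = e^{2\pi i m}\langle\theta, \varphi|\ell m\rangle = \langle\theta, \varphi|\ell m\rangle \;, \quad \text{i.e.,} \quad e^{2\pi i m} = 1 \qquad (9.105)$$

This says that, for orbital angular momentum in real 3-dimensional space, no
half-integer values are allowed for $m$ and hence that $\ell$ must be an integer.
The allowed sets are:

$$\begin{aligned}
\ell &= 0 \;, & m &= 0 \\
\ell &= 1 \;, & m &= -1,\, 0,\, 1 \\
\ell &= 2 \;, & m &= -2,\, -1,\, 0,\, 1,\, 2
\end{aligned}$$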
and so on. 
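The single-valuedness argument comes down to the value of the phase $e^{2\pi i m}$ picked up under a full turn. A minimal sketch (the function name `phase_after_full_turn` is an illustrative choice) evaluates it for a few integer and half-integer $m$:

```python
import cmath
import math

def phase_after_full_turn(m):
    """Phase factor exp(2*pi*i*m) acquired by Y_lm when phi -> phi + 2*pi."""
    return cmath.exp(2j * math.pi * m)

# Integer m leaves the wave function unchanged; half-integer m flips its sign.
phases = {m: phase_after_full_turn(m) for m in (0, 1, 2, 0.5, 1.5)}
```

Up to rounding, the integer entries equal $+1$ and the half-integer entries equal $-1$, which is why the latter are excluded by condition (9.103).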

It is important to note that we are imposing a much stronger condition than
is necessary. In general, as we have stated several times, it is not the state
vectors, operators or wave functions that have any physical meaning in quantum
mechanics. The only quantities that have physical meaning are those directly
related to measurable quantities, namely, the probabilities and the expectation
values. This single-valued condition on $m$ is not needed for the
single-valuedness of the expectation values, since the extra phase factors
involving $m$ will cancel out during the calculation of the expectation value.
Experiment, however, says that $\ell$ is an integer only, so it seems that the
strong condition is valid. We cannot prove that this is so, however.

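Let us now figure out the $Y_{\ell m}(\theta, \varphi)$. We have

$$\hat{L}_3 Y_{\ell m}(\theta, \varphi) = \frac{\hbar}{i}\frac{\partial}{\partial\varphi}Y_{\ell m}(\theta, \varphi) = \hbar m Y_{\ell m}(\theta, \varphi) \qquad (9.106)$$

which tells us the $\varphi$-dependence.

$$Y_{\ell m}(\theta, \varphi) = e^{im\varphi}P_{\ell m}(\theta) \qquad (9.107)$$

Now we also must have, since $\ell$ = maximum value of $m$,

$$\langle\theta\varphi|\hat{L}_+|\ell\ell\rangle = 0 = \hat{L}_+\langle\theta\varphi|\ell\ell\rangle = \hat{L}_+ Y_{\ell\ell}(\theta, \varphi) \qquad (9.108)$$

Using the expressions for $\hat{L}_x$, $\hat{L}_y$, $\hat{L}_+$ and $\hat{L}_-$ we have

$$\frac{\hbar}{i}e^{i\varphi}\left[i\frac{\partial}{\partial\theta} - \cot\theta\frac{\partial}{\partial\varphi}\right]Y_{\ell\ell}(\theta, \varphi) = 0 \qquad (9.109)$$

Using

$$\frac{\hbar}{i}\frac{\partial}{\partial\varphi}Y_{\ell\ell}(\theta, \varphi) = \ell\hbar\,Y_{\ell\ell}(\theta, \varphi) \qquad (9.110)$$

we get

$$\left[\frac{\partial}{\partial\theta} - \ell\cot\theta\right]P_{\ell\ell}(\theta) = 0 \qquad (9.111)$$

$$P_{\ell\ell}(\theta) = (\sin\theta)^\ell \qquad (9.112)$$

Therefore, the final expression is

$$Y_{\ell\ell}(\theta, \varphi) = A_{\ell\ell}\,e^{i\ell\varphi}(\sin\theta)^\ell \qquad (9.113)$$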

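Now that we have generated the topmost spherical harmonic, we can generate
all the others for a given $\ell$ using the lowering operators, i.e.,

$$\hat{L}_- Y_{\ell m}(\theta, \varphi) = \hbar\sqrt{\ell(\ell+1) - m(m-1)}\;Y_{\ell, m-1}(\theta, \varphi) \qquad (9.114)$$

where

$$\hat{L}_- = \frac{\hbar}{i}e^{-i\varphi}\left[-i\frac{\partial}{\partial\theta} - \cot\theta\frac{\partial}{\partial\varphi}\right] \qquad (9.115)$$

In general, we choose the $A_{\ell m}$ so that the $Y_{\ell m}(\theta, \varphi)$ are normalized, i.e.,

$$1 = \int d\Omega\,|Y_{\ell m}(\theta, \varphi)|^2 = \int_0^{2\pi} d\varphi \int_0^{\pi} \sin\theta\,d\theta\,|Y_{\ell m}(\theta, \varphi)|^2 \qquad (9.116)$$

Since the $Y_{\ell m}(\theta, \varphi)$ are eigenfunctions of Hermitian operators, they form a
complete set which we can always make orthogonal, so that we also assume that
we have

$$\int_0^{2\pi} d\varphi \int_0^{\pi} \sin\theta\,d\theta\;Y^*_{\ell' m'}(\theta, \varphi)\,Y_{\ell m}(\theta, \varphi) = \delta_{\ell'\ell}\,\delta_{m'm} \qquad (9.117)$$

The algebra is complicated. The general result is

$$Y_{\ell m}(\theta, \varphi) = \frac{(-1)^\ell}{2^\ell\,\ell!}\sqrt{\frac{2\ell+1}{4\pi}\frac{(\ell+m)!}{(\ell-m)!}}\;\frac{e^{im\varphi}}{(\sin\theta)^m}\left(\frac{d}{d\cos\theta}\right)^{\ell-m}(\sin\theta)^{2\ell} \qquad (9.118)$$

Some examples are:

$$Y_{00} = \frac{1}{\sqrt{4\pi}} \;,\quad Y_{10} = \sqrt{\frac{3}{4\pi}}\cos\theta \qquad (9.119)$$

$$Y_{1,\pm 1} = \mp\sqrt{\frac{3}{8\pi}}\,e^{\pm i\varphi}\sin\theta \;,\quad Y_{20} = \sqrt{\frac{5}{16\pi}}\,(3\cos^2\theta - 1) \qquad (9.120)$$

$$Y_{2,\pm 1} = \mp\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\,e^{\pm i\varphi} \;,\quad Y_{2,\pm 2} = \sqrt{\frac{15}{32\pi}}\,\sin^2\theta\,e^{\pm 2i\varphi} \qquad (9.121)$$

Some Properties

$$Y_{\ell, -m}(\theta, \varphi) = (-1)^m\,Y^*_{\ell m}(\theta, \varphi) \qquad (9.122)$$

Under the parity operation

$$\vec{r} \to -\vec{r} \quad \text{or} \quad r \to r \;,\; \theta \to \pi - \theta \;,\; \varphi \to \varphi + \pi \qquad (9.123)$$

This says that

$$\begin{aligned}
e^{im\varphi} &\to e^{im\varphi}e^{im\pi} = (-1)^m e^{im\varphi} \\
\sin\theta &\to \sin(\pi - \theta) \to \sin\theta \\
\cos\theta &\to \cos(\pi - \theta) \to -\cos\theta
\end{aligned} \qquad (9.124)$$

which imply that

$$Y_{\ell m}(\theta, \varphi) \to (-1)^\ell\,Y_{\ell m}(\theta, \varphi) \qquad (9.125)$$

Therefore,

if $\ell$ is even, then we have an even parity state
if $\ell$ is odd, then we have an odd parity state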
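The orthonormality condition (9.117) and the parity rule can be spot-checked numerically for the low-order harmonics. The sketch below is an illustration, not the author's method: it hard-codes the standard $\ell \le 2$ expressions, obtains negative $m$ from $Y_{\ell,-m} = (-1)^m Y^*_{\ell m}$, and estimates the angular integral with a simple midpoint rule (the grid sizes are arbitrary assumptions).

```python
import cmath
import math

def Y(l, m, theta, phi):
    """Spherical harmonics for l <= 2 (standard closed forms)."""
    if m < 0:
        # Y_{l,-m} = (-1)^m conj(Y_{l,m})
        return (-1) ** (-m) * Y(l, -m, theta, phi).conjugate()
    s, c = math.sin(theta), math.cos(theta)
    table = {
        (0, 0): math.sqrt(1.0 / (4.0 * math.pi)),
        (1, 0): math.sqrt(3.0 / (4.0 * math.pi)) * c,
        (1, 1): -math.sqrt(3.0 / (8.0 * math.pi)) * s,
        (2, 0): math.sqrt(5.0 / (16.0 * math.pi)) * (3.0 * c * c - 1.0),
        (2, 1): -math.sqrt(15.0 / (8.0 * math.pi)) * s * c,
        (2, 2): math.sqrt(15.0 / (32.0 * math.pi)) * s * s,
    }
    return table[(l, m)] * cmath.exp(1j * m * phi)

def overlap(l1, m1, l2, m2, n_theta=200, n_phi=64):
    """Midpoint-rule estimate of the integral of conj(Y_{l1 m1}) Y_{l2 m2} over the sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0 + 0.0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for k in range(n_phi):
            phi = (k + 0.5) * d_phi
            total += (Y(l1, m1, theta, phi).conjugate()
                      * Y(l2, m2, theta, phi) * math.sin(theta))
    return total * d_theta * d_phi

def parity_ratio(l, m, theta, phi):
    """Y_lm at the parity-reflected angles divided by Y_lm; should equal (-1)^l."""
    return Y(l, m, math.pi - theta, phi + math.pi) / Y(l, m, theta, phi)
```

For example, `overlap(1, 1, 1, 1)` should be close to 1, `overlap(1, 0, 2, 0)` close to 0, and `parity_ratio(1, 1, 0.7, 1.1)` close to $-1$.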

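Since they form a complete set, any function of $(\theta, \varphi)$ can be expanded in
terms of the $Y_{\ell m}(\theta, \varphi)$ (the $Y_{\ell m}(\theta, \varphi)$ are a basis), i.e., we can write

$$f(\theta, \varphi) = \sum_{\ell, m} f_{\ell m}\,Y_{\ell m}(\theta, \varphi) \qquad (9.126)$$

where

$$f_{\ell m} = \int_0^{2\pi} d\varphi \int_0^{\pi} \sin\theta\,d\theta\;Y^*_{\ell m}(\theta, \varphi)\,f(\theta, \varphi) \qquad (9.127)$$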

9.3. Spin

As we discussed earlier, a second kind of angular momentum exists in quantum
mechanics. It is related to internal degrees of freedom of particles and is not
related in any way to ordinary 3-dimensional space properties.

We designate this new angular momentum by spin and represent it by the
operator $\vec{S}_{op}$, where, since it is an angular momentum, its components must
satisfy the commutators

$$[\hat{S}_i, \hat{S}_j] = i\hbar\,\varepsilon_{ijk}\hat{S}_k \qquad (9.128)$$

where we have used the Einstein summation convention over repeated indices.


The analysis for the eigenvalues and eigenvectors follows the same path as for
earlier discussions. We have

$$\hat{S}^2_{op}|s, m_s\rangle = \hbar^2 s(s+1)|s, m_s\rangle \qquad (9.129)$$

$$\hat{S}_3|s, m_s\rangle = \hbar m_s|s, m_s\rangle \qquad (9.130)$$

which, together with the commutators, gives the following results.

For a given value of $s$, we have $2s+1$ $m_s$-values

$$m_s = -s,\; -s+1,\; -s+2,\; \ldots,\; s-2,\; s-1,\; s \qquad (9.131)$$

where

$$s = \frac{\text{integer}}{2} \ge 0 \qquad (9.132)$$


There are no boundary conditions restricting the value of s, so we can have both 
integer and half-integer values. 


We now turn our attention to a most important special case and then generalize 
the details. 


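9.3.1. Spin 1/2

We define a new operator $\vec{\sigma}_{op}$ such that

$$\vec{S}_{op} = \frac{1}{2}\hbar\,\vec{\sigma}_{op} \qquad (9.133)$$

where $\vec{\sigma}_{op} = (\hat{\sigma}_1, \hat{\sigma}_2, \hat{\sigma}_3)$ are called the Pauli spin operators.

It is experimentally observed that if one measures the component of this spin
angular momentum along any direction, one always obtains $\pm\hbar/2$.

If we designate the state with spin = $+\hbar/2$ or spin up in the $\hat{n}$ direction by
the ket vectors $|\hat{n}\uparrow\rangle$ or $|\hat{n}+\rangle$, we then have

$$\vec{S}_{op}\cdot\hat{n}\,|\hat{n}\uparrow\rangle = \frac{\hbar}{2}|\hat{n}\uparrow\rangle \quad \text{and} \quad \vec{S}_{op}\cdot\hat{n}\,|\hat{n}\downarrow\rangle = -\frac{\hbar}{2}|\hat{n}\downarrow\rangle \qquad (9.134)$$

Any pair of eigenvectors $|\hat{n}\uparrow\rangle$ and $|\hat{n}\downarrow\rangle$ for a given direction $\hat{n}$ form a
basis for the vector space associated with spin = 1/2 and we have

$\langle\hat{n}\uparrow|\psi\rangle$ = amplitude for finding spin "up" along $\hat{n}$ if we are in the state $|\psi\rangle$

$\langle\hat{n}\downarrow|\psi\rangle$ = amplitude for finding spin "down" along $\hat{n}$ if we are in the state $|\psi\rangle$

These two amplitudes exhaust all possible measurements for spin = 1/2 in the
$\hat{n}$ direction and therefore completely specify the state $|\psi\rangle$. That is what we
mean physically when we say they form a basis set or representation.

When we build the standard basis from these amplitudes, we choose its members
to be eigenstates of the $\hat{S}_3$ or $\hat{S}_z$ operator, i.e., if we write

$$|\psi\rangle = \begin{pmatrix} \langle\hat{z}\uparrow|\psi\rangle \\ \langle\hat{z}\downarrow|\psi\rangle \end{pmatrix} = \text{a 2-component vector} \qquad (9.135)$$

then the appropriate basis is

$$|\hat{z}\uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \;\to\; \text{spin up in z-direction}$$

$$|\hat{z}\downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \;\to\; \text{spin down in z-direction}$$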

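Matrix Representations

Using this basis, the matrix representation of

$$\hat{S}_z = \frac{\hbar}{2}\hat{\sigma}_z \qquad (9.136)$$

is

$$\hat{S}_z = \begin{pmatrix} \langle\hat{z}\uparrow|\hat{S}_z|\hat{z}\uparrow\rangle & \langle\hat{z}\uparrow|\hat{S}_z|\hat{z}\downarrow\rangle \\ \langle\hat{z}\downarrow|\hat{S}_z|\hat{z}\uparrow\rangle & \langle\hat{z}\downarrow|\hat{S}_z|\hat{z}\downarrow\rangle \end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{\hbar}{2}\hat{\sigma}_z \;\to\; \hat{\sigma}_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (9.137)$$

Now

$$\hat{S}_\pm = \hat{S}_x \pm i\hat{S}_y \qquad (9.138)$$

$$\to\; \hat{S}_x = \frac{\hat{S}_+ + \hat{S}_-}{2} \quad \text{and} \quad \hat{S}_y = \frac{\hat{S}_+ - \hat{S}_-}{2i} \qquad (9.139)$$

Therefore, using the ladder-operator relations (9.45),

$$\hat{S}_x = \begin{pmatrix} \langle\hat{z}\uparrow|\hat{S}_x|\hat{z}\uparrow\rangle & \langle\hat{z}\uparrow|\hat{S}_x|\hat{z}\downarrow\rangle \\ \langle\hat{z}\downarrow|\hat{S}_x|\hat{z}\uparrow\rangle & \langle\hat{z}\downarrow|\hat{S}_x|\hat{z}\downarrow\rangle \end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \frac{\hbar}{2}\hat{\sigma}_x \;\to\; \hat{\sigma}_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (9.140)$$

and in a similar way

$$\hat{\sigma}_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad (9.141)$$

Properties of the $\hat{\sigma}_i$

From

$$[\hat{S}_i, \hat{S}_j] = i\hbar\,\varepsilon_{ijk}\hat{S}_k$$

and

$$\vec{S}_{op} = \frac{1}{2}\hbar\,\vec{\sigma}_{op}$$

we get the commutation relations

$$[\hat{\sigma}_i, \hat{\sigma}_j] = 2i\,\varepsilon_{ijk}\hat{\sigma}_k \qquad (9.142)$$

In addition, we have

$$\hat{\sigma}_i\hat{\sigma}_j = i\,\varepsilon_{ijk}\hat{\sigma}_k \;,\quad i \ne j \qquad (9.143)$$

$$\hat{\sigma}_i\hat{\sigma}_j + \hat{\sigma}_j\hat{\sigma}_i = \{\hat{\sigma}_i, \hat{\sigma}_j\} = 0 \;,\quad i \ne j \quad \text{(called the anticommutator)} \qquad (9.144)$$

$$\hat{\sigma}_i^2 = \hat{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (9.145)$$

The fact that the spin = 1/2 operators anticommute is directly linked to the
existence of fermions, as one can see when the relativistic equation for the
electron is studied.

Put all together, these relations give

$$\hat{\sigma}_i\hat{\sigma}_j = \delta_{ij} + i\,\varepsilon_{ijk}\hat{\sigma}_k \qquad (9.146)$$

or going back to the $\hat{S}_i$

$$\hat{S}_i\hat{S}_j = \frac{\hbar^2}{4}\delta_{ij} + i\,\varepsilon_{ijk}\frac{\hbar}{2}\hat{S}_k \qquad (9.147)$$

In the special case of spin = 1/2, we have

$$\hat{S}^2_{op} = \frac{\hbar^2}{4}(\hat{\sigma}_1^2 + \hat{\sigma}_2^2 + \hat{\sigma}_3^2) = \frac{\hbar^2}{4}(\hat{I} + \hat{I} + \hat{I}) = \frac{3\hbar^2}{4}\hat{I} = \hbar^2 s(s+1)\hat{I} \quad \text{with } s = \frac{1}{2} \qquad (9.148)$$

A very useful property that we will employ many times later on is (using
(9.146))

$$\begin{aligned}
(\vec{a}\cdot\vec{\sigma}_{op})(\vec{b}\cdot\vec{\sigma}_{op}) &= a_i\hat{\sigma}_i\,b_j\hat{\sigma}_j \\
&= \delta_{ij}a_i b_j + i\,\varepsilon_{ijk}a_i b_j\hat{\sigma}_k = a_i b_i + i\,\varepsilon_{ijk}a_i b_j\hat{\sigma}_k \\
&= \vec{a}\cdot\vec{b} + i(\vec{a} \times \vec{b})\cdot\vec{\sigma}_{op}
\end{aligned} \qquad (9.149)$$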
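Identity (9.149) is easy to confirm with explicit $2 \times 2$ matrices. The following sketch is an illustration only (the helper names and the two test vectors are assumptions): it builds $\vec{v}\cdot\vec{\sigma}$ from the Pauli matrices and compares both sides of (9.149) entry by entry.

```python
def sigma_dot(v):
    """Return the 2x2 matrix v . sigma for a real 3-vector v."""
    vx, vy, vz = v
    return [[vz, vx - 1j * vy], [vx + 1j * vy, -vz]]

def matmul2(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def identity_gap(a, b):
    """Largest entry of (a.sigma)(b.sigma) - [a.b I + i (a x b).sigma]."""
    lhs = matmul2(sigma_dot(a), sigma_dot(b))
    s = sigma_dot(cross(a, b))
    d = dot(a, b)
    rhs = [[(d if r == c else 0.0) + 1j * s[r][c] for c in range(2)]
           for r in range(2)]
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(2) for c in range(2))

gap = identity_gap((0.3, -1.2, 0.5), (2.0, 0.1, -0.7))
```

The value of `gap` should be zero up to rounding for any pair of real vectors.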
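Rotations in Spin Space

We have said that

$$\vec{S}_{op}\cdot\hat{n}\,|\hat{n}\pm\rangle = \pm\frac{\hbar}{2}|\hat{n}\pm\rangle \qquad (9.150)$$

What do the states $|\hat{n}\pm\rangle$ look like in the $|\hat{z}\pm\rangle$ basis? One way to find out
is the direct approach.

Let us choose a Cartesian basis for the unit vector (we will look at other
choices afterwards)

$$\hat{n} = (n_x, n_y, n_z) \;,\quad \text{all real} \qquad (9.151)$$

We then have

$$\begin{aligned}
\vec{S}_{op}\cdot\hat{n} &= \frac{\hbar}{2}\vec{\sigma}_{op}\cdot\hat{n} = \frac{\hbar}{2}(n_x\hat{\sigma}_x + n_y\hat{\sigma}_y + n_z\hat{\sigma}_z) \\
&= \frac{\hbar}{2}\left(n_x\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + n_y\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + n_z\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\right) \\
&= \frac{\hbar}{2}\begin{pmatrix} n_z & n_x - in_y \\ n_x + in_y & -n_z \end{pmatrix}
\end{aligned} \qquad (9.152)$$

and

$$\vec{S}_{op}\cdot\hat{n}\,|\hat{n}+\rangle = +\frac{\hbar}{2}|\hat{n}+\rangle \qquad (9.153)$$

$$\frac{\hbar}{2}\begin{pmatrix} n_z & n_x - in_y \\ n_x + in_y & -n_z \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix} a \\ b \end{pmatrix} \qquad (9.154)$$

where we have represented

$$|\hat{n}+\rangle = \begin{pmatrix} a \\ b \end{pmatrix} \qquad (9.155)$$

This matrix equation gives two homogeneous equations for $a$ and $b$

$$(n_z - 1)a + (n_x - in_y)b = 0$$
$$(n_x + in_y)a - (n_z + 1)b = 0$$

or

$$\frac{a}{b} = -\frac{n_x - in_y}{n_z - 1} = \frac{n_z + 1}{n_x + in_y} \qquad (9.156)$$

The homogeneous equations have a non-trivial solution only if the determinant
of the coefficients of $a$ and $b$ equals zero or

$$(n_z + 1)(n_z - 1) + (n_x - in_y)(n_x + in_y) = 0 \qquad (9.157)$$

which is satisfied identically because $n_x^2 + n_y^2 + n_z^2 = 1$. We assume that the
vector $|\hat{n}+\rangle$ is normalized to one or

$$|a|^2 + |b|^2 = 1 \qquad (9.158)$$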
(9.158) 
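Putting all this together we get

$$a = \frac{1}{\sqrt{2}}\sqrt{1 + n_z} \quad \text{and} \quad b = \frac{1}{\sqrt{2}}\sqrt{1 - n_z} \qquad (9.159)$$

$$|\hat{n}+\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} \sqrt{1 + n_z} \\ \sqrt{1 - n_z} \end{pmatrix} \qquad (9.160)$$

Here $a$ has been chosen real and $b$ is written for $n_y = 0$, $n_x \ge 0$. In general
(9.156) fixes the relative phase through $b/a = (n_x + in_y)/(1 + n_z)$, so that $b$
carries the phase of $n_x + in_y$ while its magnitude is $\sqrt{(1 - n_z)/2}$.

We can easily check this by letting $|\hat{n}+\rangle = |\hat{z}+\rangle$ or $n_z = 1$, $n_x = n_y = 0$,
which gives

$$|\hat{z}+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad (9.161)$$

as expected. In a similar manner

$$|\hat{n}-\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} -\sqrt{1 - n_z} \\ \sqrt{1 + n_z} \end{pmatrix} \qquad (9.162)$$

Note that the two vectors $|\hat{n}+\rangle$ and $|\hat{n}-\rangle$ are orthonormal as we expect.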
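The eigenvectors can be verified directly against the eigenvalue equation (9.154). The sketch below is an illustration rather than the text's own construction: it assumes the general-phase form $a = \sqrt{(1+n_z)/2}$, $b = (n_x + in_y)/\sqrt{2(1+n_z)}$, which follows from (9.156) and (9.158) for $n_z \ne -1$, and takes $|\hat{n}-\rangle = (-b^*, a)$ as the orthogonal partner. All function names are hypothetical.

```python
import math

def sigma_dot(n):
    """2x2 matrix n . sigma, i.e. (2/hbar) S.n from (9.152)."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def spin_up(n):
    # Assumed general-phase solution of (9.156) and (9.158); requires nz != -1.
    nx, ny, nz = n
    a = math.sqrt((1.0 + nz) / 2.0)
    b = (nx + 1j * ny) / math.sqrt(2.0 * (1.0 + nz))
    return (a, b)

def spin_down(n):
    # Orthogonal partner (-conj(b), a) of the spin-up vector (a, b).
    a, b = spin_up(n)
    return (-b.conjugate(), a)

def inner(u, v):
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

n_hat = (2.0 / 7.0, 3.0 / 7.0, 6.0 / 7.0)   # unit vector: 4 + 9 + 36 = 49
up, down = spin_up(n_hat), spin_down(n_hat)
sigma_up = apply(sigma_dot(n_hat), up)       # should equal +up
sigma_down = apply(sigma_dot(n_hat), down)   # should equal -down
```

For this direction the magnitude of the lower component of `up` equals $\sqrt{(1 - n_z)/2}$, in agreement with (9.159).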


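This calculation can also be carried out in other coordinate bases. For the
spherical-polar basis we get

$$\hat{n} = (\sin\theta\cos\varphi,\; \sin\theta\sin\varphi,\; \cos\theta) \qquad (9.163)$$

$$\vec{S}_{op}\cdot\hat{n} = \frac{\hbar}{2}\begin{pmatrix} \cos\theta & e^{-i\varphi}\sin\theta \\ e^{i\varphi}\sin\theta & -\cos\theta \end{pmatrix} \qquad (9.164)$$

$$|\hat{n}+\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ e^{i\varphi}\sin\frac{\theta}{2} \end{pmatrix} \quad \text{and} \quad |\hat{n}-\rangle = \begin{pmatrix} \sin\frac{\theta}{2} \\ -e^{i\varphi}\cos\frac{\theta}{2} \end{pmatrix} \qquad (9.165)$$

What can we say about operators in the spin = 1/2 vector space?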

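Any such operator $\hat{B}$ can be expressed as a linear combination of the four
linearly independent matrices $\{\hat{I}, \hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z\}$ (they are, in fact, a basis for
all $2 \times 2$ matrices)

$$\hat{B} = a_0\hat{I} + a_x\hat{\sigma}_x + a_y\hat{\sigma}_y + a_z\hat{\sigma}_z = a_0\hat{I} + \vec{a}\cdot\vec{\sigma}_{op} = \begin{pmatrix} a_0 + a_z & a_x - ia_y \\ a_x + ia_y & a_0 - a_z \end{pmatrix} \qquad (9.166)$$

In particular, the density operator or state operator $\hat{W}$, which is an
operator like any other, can be written as

$$\hat{W} = \frac{1}{2}(\hat{I} + \vec{a}\cdot\vec{\sigma}_{op}) \qquad (9.167)$$

where the factor of 1/2 has been chosen so that we have the required property

$$Tr\hat{W} = \frac{1}{2}Tr\hat{I} + \frac{1}{2}a_i\,Tr\hat{\sigma}_i = 1 \quad \text{since } Tr\hat{\sigma}_i = 0 \qquad (9.168)$$

Since $\hat{W} = \hat{W}^\dagger$ we must also have all the $a_i$ real. What is the physical
meaning of the vector $\vec{a}$?

Consider the following

$$\begin{aligned}
\langle\hat{\sigma}_x\rangle &= Tr(\hat{W}\hat{\sigma}_x) = \frac{1}{2}Tr((\hat{I} + a_x\hat{\sigma}_x + a_y\hat{\sigma}_y + a_z\hat{\sigma}_z)\hat{\sigma}_x) \\
&= \frac{1}{2}Tr(\hat{\sigma}_x + a_x\hat{\sigma}_x^2 + a_y\hat{\sigma}_y\hat{\sigma}_x + a_z\hat{\sigma}_z\hat{\sigma}_x) \\
&= \frac{1}{2}Tr(\hat{\sigma}_x + a_x\hat{I} - ia_y\hat{\sigma}_z + ia_z\hat{\sigma}_y) \\
&= \frac{a_x}{2}Tr(\hat{I}) = a_x
\end{aligned} \qquad (9.169)$$

or, in general,

$$\langle\vec{\sigma}_{op}\rangle = Tr(\hat{W}\vec{\sigma}_{op}) = \vec{a} = \text{polarization vector} \qquad (9.170)$$

Now the eigenvalues of $\hat{W}$ are equal to

$$\frac{1}{2} + \frac{1}{2}\,(\text{eigenvalues of } \vec{a}\cdot\vec{\sigma}_{op}) \qquad (9.171)$$

and from our earlier work the eigenvalues of $\hat{a}\cdot\vec{\sigma}_{op}$ are $\pm 1$, so those of
$\vec{a}\cdot\vec{\sigma}_{op}$ are $\pm|\vec{a}|$. Therefore the eigenvalues of $\hat{W}$ are

$$\frac{1}{2}(1 \pm |\vec{a}|) \qquad (9.172)$$

But, as we showed earlier, all eigenvalues of $\hat{W}$ are $\ge 0$, which says that
polarization vectors have a length $|\vec{a}|$ restricted to $0 \le |\vec{a}| \le 1$.

Pure states have $|\vec{a}| = 1$ and this gives eigenvalues 1 and 0 for $\hat{W}$, which
corresponds to maximum polarization.

Note that for $\vec{a} = a\hat{e}_3 = \hat{e}_3$, we have

$$\hat{W} = \frac{1}{2}(\hat{I} + \hat{\sigma}_3) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = |\hat{z}+\rangle\langle\hat{z}+| \qquad (9.173)$$

as it should for a pure state.

An unpolarized state has $|\vec{a}| = 0$ and this gives eigenvalues (1/2, 1/2) for
$\hat{W}$. This represents an isotropic state where $\langle\vec{S}_{op}\rangle = 0$.

In this case we have

$$\hat{W} = \frac{1}{2}\hat{I} = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \frac{1}{2}|\hat{z}+\rangle\langle\hat{z}+| + \frac{1}{2}|\hat{z}-\rangle\langle\hat{z}-| \qquad (9.174)$$

as we expect for a nonpure state or a mixture.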
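The properties of $\hat{W}$ in (9.167) to (9.172) can be checked for any polarization vector. The sketch below is a minimal illustration (function names and the sample vector are assumptions): it builds $\hat{W}$, computes its trace and eigenvalues from the $2 \times 2$ trace and determinant, and recovers $\vec{a}$ from $Tr(\hat{W}\hat{\sigma}_i)$.

```python
import math

SIGMA = (
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
)

def density_matrix(a):
    """W = (I + a . sigma) / 2 for a real polarization vector a."""
    ax, ay, az = a
    return [[0.5 * (1 + az), 0.5 * (ax - 1j * ay)],
            [0.5 * (ax + 1j * ay), 0.5 * (1 - az)]]

def matmul2(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def eigenvalues_hermitian_2x2(M):
    """Eigenvalues from trace and determinant; valid for Hermitian 2x2 M."""
    t = complex(trace(M)).real
    det = complex(M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    disc = math.sqrt(max(t * t / 4.0 - det, 0.0))
    return (t / 2.0 + disc, t / 2.0 - disc)

def polarization(W):
    """Expectation values <sigma_i> = Tr(W sigma_i)."""
    return tuple(complex(trace(matmul2(W, s))).real for s in SIGMA)

a_vec = (0.3, -0.2, 0.5)
W = density_matrix(a_vec)
length = math.sqrt(sum(x * x for x in a_vec))
eig_plus, eig_minus = eigenvalues_hermitian_2x2(W)
recovered = polarization(W)
```

With $|\vec{a}| < 1$ as here, both eigenvalues lie strictly between 0 and 1, which is the mixed-state case; setting `a_vec = (0, 0, 1)` reproduces the pure state (9.173).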
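Let us now connect all of this to rotations in spin space. For an infinitesimal
rotation through angle $\alpha = |\vec{\alpha}|$ about an axis along $\hat{\alpha} = \vec{\alpha}/\alpha$, a unit vector
$\hat{m}$ becomes

$$\hat{n} = \hat{m} + \vec{\alpha} \times \hat{m} \qquad (9.175)$$

Following the same steps as earlier, this implies

$$\vec{S}_{op}\cdot\hat{n} = \vec{S}_{op}\cdot\hat{m} + \vec{S}_{op}\cdot(\vec{\alpha} \times \hat{m}) = \vec{S}_{op}\cdot\hat{m} + \varepsilon_{ijk}\alpha_i m_j\hat{S}_k \qquad (9.176)$$

But we have

$$\varepsilon_{ijk}\hat{S}_k = \frac{1}{i\hbar}[\hat{S}_i, \hat{S}_j] \qquad (9.177)$$

which implies that

$$\begin{aligned}
\vec{S}_{op}\cdot\hat{n} &= \vec{S}_{op}\cdot\hat{m} + \frac{1}{i\hbar}[\hat{S}_i, \hat{S}_j]\,\alpha_i m_j \\
&= \vec{S}_{op}\cdot\hat{m} + \frac{i}{\hbar}[\vec{S}_{op}\cdot\hat{m},\, \vec{S}_{op}\cdot\vec{\alpha}]
\end{aligned} \qquad (9.178)$$

Using the same approximations as we did earlier, we can see that this
expression, to first order in $\alpha$, is equivalent to

$$\vec{S}_{op}\cdot\hat{n} = e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}}\;\vec{S}_{op}\cdot\hat{m}\;e^{\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}} \qquad (9.179)$$

This result holds for finite rotation angles also.

Now using this result, we have

$$\begin{aligned}
\vec{S}_{op}\cdot\hat{n}\left[e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}}|\hat{m}+\rangle\right] &= \left[e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}}\;\vec{S}_{op}\cdot\hat{m}\right]|\hat{m}+\rangle \\
&= \frac{\hbar}{2}\left[e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}}|\hat{m}+\rangle\right]
\end{aligned} \qquad (9.180)$$

This says that

$$e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}}|\hat{m}+\rangle = |\hat{n}+\rangle \quad \text{and similarly} \quad e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}}|\hat{m}-\rangle = |\hat{n}-\rangle \qquad (9.181)$$

The rotation that takes $\hat{m} \to \hat{n}$ is not unique, however. We are free to rotate
by an arbitrary amount about $\hat{n}$ after rotating $\hat{m}$ into $\hat{n}$. This freedom
corresponds to adding a phase factor.

We say that the unitary operator

$$e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}} \qquad (9.182)$$

has the effect of rotating the eigenstate of $\vec{S}_{op}\cdot\hat{m}$ into the eigenstate of
$\vec{S}_{op}\cdot\hat{n}$. The operator performs rotations on the spin degrees of freedom. The
equations are analogous to those for real rotations in space generated by
$\vec{L}_{op}$.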
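Let us now work out a very useful identity. We can write

$$e^{-\frac{i}{\hbar}\vec{S}_{op}\cdot\vec{\alpha}} = e^{-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}} = \sum_{n=0}^{\infty} \frac{\left(-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}\right)^n}{n!} \qquad (9.183)$$

Now we have

$$(\vec{\sigma}_{op}\cdot\vec{\alpha})^2 = \vec{\alpha}\cdot\vec{\alpha} + i(\vec{\alpha} \times \vec{\alpha})\cdot\vec{\sigma}_{op} = \alpha^2 \qquad (9.184)$$

Therefore

$$\begin{aligned}
e^{-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}} &= \hat{I} + \frac{1}{1!}\left(-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}\right) + \frac{1}{2!}\left(-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}\right)^2 + \frac{1}{3!}\left(-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}\right)^3 + \ldots \\
&= \hat{I}\left(1 - \frac{(\alpha/2)^2}{2!} + \frac{(\alpha/2)^4}{4!} - \ldots\right) - i(\vec{\sigma}_{op}\cdot\hat{\alpha})\left(\frac{(\alpha/2)}{1!} - \frac{(\alpha/2)^3}{3!} + \ldots\right) \\
&= \hat{I}\cos\frac{\alpha}{2} - i(\vec{\sigma}_{op}\cdot\hat{\alpha})\sin\frac{\alpha}{2}
\end{aligned} \qquad (9.185)$$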

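Consider an example. Let $\vec{\alpha} \to -90^\circ$ rotation about the $x$-axis or

$$\vec{\alpha} \to -\frac{\pi}{2}\hat{x} \qquad (9.186)$$

Now we have

$$\hat{\sigma}_z(\hat{\sigma}_x)^n = (-\hat{\sigma}_x)^n\hat{\sigma}_z \qquad (9.187)$$

which follows from the anticommutation relations. This implies that

$$\hat{\sigma}_z f(\hat{\sigma}_x) = f(-\hat{\sigma}_x)\hat{\sigma}_z$$
$$\hat{\sigma}_z f(\hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z) = f(-\hat{\sigma}_x, -\hat{\sigma}_y, \hat{\sigma}_z)\hat{\sigma}_z$$

Using these relations we get

$$\begin{aligned}
e^{-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}}\,\hat{\sigma}_z\,e^{\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}} &= e^{\frac{i\pi}{4}\hat{\sigma}_x}\,\hat{\sigma}_z\,e^{-\frac{i\pi}{4}\hat{\sigma}_x} = e^{\frac{i\pi}{4}\hat{\sigma}_x}e^{\frac{i\pi}{4}\hat{\sigma}_x}\hat{\sigma}_z = e^{\frac{i\pi}{2}\hat{\sigma}_x}\hat{\sigma}_z \\
&= \left(\cos\frac{\pi}{2} + i\hat{\sigma}_x\sin\frac{\pi}{2}\right)\hat{\sigma}_z = i\hat{\sigma}_x\hat{\sigma}_z = \hat{\sigma}_y
\end{aligned} \qquad (9.188)$$

as expected for this particular rotation.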


Now the spin degrees of freedom are simply additional degrees of freedom for 
the system. The spin degrees of freedom are independent of the spatial degrees 
of freedom, however. This means that we can specify them independently or 
that S op commutes with all operators that depend on 3-dimensional space 

$$[\vec{S}_{op}, \vec{r}_{op}] = 0 \;,\quad [\vec{S}_{op}, \vec{p}_{op}] = 0 \;,\quad [\vec{S}_{op}, \vec{L}_{op}] = 0 \qquad (9.189)$$

To specify completely the state of a spinning particle or a system with internal 
degrees of freedom of any kind we must know 

1 . the amplitudes for finding the particle at points in space 


2 . the amplitudes for different spin orientations 

By convention, we choose z as the spin quantization direction for describing 
state vectors. The total state vector is then a direct-product state of the form 

| ip) = |external) <g> |internal) (9.190) 


and the amplitudes are 

(r, z+ | ip) = probability amplitude for finding the particle at 
f with spin up in the 2 direction 
(r, z— | ip) = probability amplitude for finding the particle at 
f with spin down in the z direction 

where 

(r, 2 + | Ip) = (?’ | Ipexternal) {z+ | Ipinternal) (9.191) 

and so on. 

The total probability density for finding a particle at $\vec{r}$ is then the sum of two terms

$$|\langle\vec{r},\hat{z}+|\psi\rangle|^{2} + |\langle\vec{r},\hat{z}-|\psi\rangle|^{2} \tag{9.192}$$

which represents a sum over all the ways of doing it.

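The total angular momentum of such a spinning particle is the sum of its orbital and spin angular momenta

$$\vec{J}_{op} = \vec{L}_{op} + \vec{S}_{op} \tag{9.193}$$

$\vec{J}_{op}$ is now the generator of rotations in 3-dimensional space and in spin space, i.e., it affects both external and internal degrees of freedom.

If we operate with $e^{-i\vec{\alpha}\cdot\vec{J}_{op}/\hbar}$ on the basis state $|\vec{r}_0,\hat{m}+\rangle$, where the particle is definitely at $\vec{r}_0$ with spin up ($+\hbar/2$) in the $\hat{m}$ direction, then we get a new state $|\vec{r}_0\,',\hat{n}+\rangle$ where (to first order in $\vec{\alpha}$)

$$\vec{r}_0\,' = \vec{r}_0 + \vec{\alpha}\times\vec{r}_0 \quad \text{and} \quad \hat{n} = \hat{m} + \vec{\alpha}\times\hat{m} \tag{9.194}$$

Since $[\vec{S}_{op},\vec{L}_{op}] = 0$ we have

$$\begin{aligned}
e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{J}_{op}} &= e^{-\frac{i}{\hbar}(\vec{\alpha}\cdot\vec{L}_{op} + \vec{\alpha}\cdot\vec{S}_{op})} \\
&= e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{S}_{op}}\,e^{\frac{1}{2\hbar^{2}}[\vec{\alpha}\cdot\vec{L}_{op},\,\vec{\alpha}\cdot\vec{S}_{op}]} \\
&= e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{S}_{op}}
\end{aligned} \tag{9.195}$$

This implies that

$$e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{J}_{op}}|\vec{r}_0,\hat{m}+\rangle = e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{L}_{op}}\,e^{-\frac{i}{\hbar}\vec{\alpha}\cdot\vec{S}_{op}}|\vec{r}_0,\hat{m}+\rangle = |\vec{r}_0\,',\hat{n}+\rangle \tag{9.196}$$

and that $e^{-i\vec{\alpha}\cdot\vec{L}_{op}/\hbar}$ carries out the rotation of the spatial degrees of freedom, while $e^{-i\vec{\alpha}\cdot\vec{S}_{op}/\hbar}$ carries out the rotation of the spin degrees of freedom.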


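If

$$|\psi'\rangle = e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{J}_{op}}|\psi\rangle \tag{9.197}$$

then the wave function of $|\psi'\rangle$ is

$$\langle\vec{r}_0,\hat{m}+|\psi'\rangle = \langle\vec{r}_0,\hat{m}+|\,e^{\frac{i}{\hbar}\vec{\alpha}\cdot\vec{J}_{op}}|\psi\rangle = \langle\vec{r}_0\,',\hat{n}+|\psi\rangle \tag{9.198}$$

This is the wave function of $|\psi\rangle$ evaluated at the rotated point with a rotated spin quantization direction.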


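The spin representation of rotations has a feature which is strikingly different from that of rotations in 3-dimensional space.

Consider a rotation by $2\pi$ in spin space. This implies that

$$e^{-i\pi\vec{\sigma}_{op}\cdot\hat{\alpha}} = \hat{I}\cos\pi - i(\vec{\sigma}_{op}\cdot\hat{\alpha})\sin\pi = -\hat{I} \tag{9.199}$$

A $2\pi$ rotation in spin space is represented by $-\hat{I}$. Since the rotations $\vec{\alpha}$ and $\vec{\alpha} + 2\pi\hat{\alpha}$ are physically equivalent, we must say that the spin representation of rotations is double-valued, i.e.,

$$e^{-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}} \quad \text{and} \quad e^{-\frac{i}{2}\vec{\sigma}_{op}\cdot(\vec{\alpha}+2\pi\hat{\alpha})} = -e^{-\frac{i}{2}\vec{\sigma}_{op}\cdot\vec{\alpha}} \tag{9.200}$$

represent the same rotation.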
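The double-valuedness in (9.199) and (9.200) is easy to see numerically. The sketch below, which assumes the closed form $\hat{I}\cos(\alpha/2) - i(\vec{\sigma}_{op}\cdot\hat{\alpha})\sin(\alpha/2)$ and uses an arbitrarily chosen axis and angle, builds the spin-1/2 rotation matrix and compares rotations by $\alpha$, $\alpha + 2\pi$ and $\alpha + 4\pi$. The function name `spin_half_rotation` is an illustrative choice.

```python
import math


def spin_half_rotation(alpha, n):
    """2x2 matrix cos(alpha/2) I - i sin(alpha/2) (sigma . n), n a unit vector."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    nx, ny, nz = n
    return [
        [c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
        [-1j * s * (nx + 1j * ny), c + 1j * s * nz],
    ]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def negate(a):
    return [[-a[i][j] for j in range(2)] for i in range(2)]


axis = (0.6, 0.0, 0.8)   # arbitrary unit vector (assumed for illustration)
alpha = 0.9              # arbitrary rotation angle

u = spin_half_rotation(alpha, axis)
u_plus_2pi = spin_half_rotation(alpha + 2 * math.pi, axis)
u_plus_4pi = spin_half_rotation(alpha + 4 * math.pi, axis)

# A further 2*pi flips the sign of the matrix; a further 4*pi restores it.
sign_flip_error = max_abs_diff(u_plus_2pi, negate(u))
restore_error = max_abs_diff(u_plus_4pi, u)
print(sign_flip_error, restore_error)
```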


9.3.2. Superselection Rules 

Let us expand on this important point. The $2\pi$ rotation transformation operator is given by

$$\hat{U}_{\hat{n}}(2\pi) = e^{-\frac{2\pi i}{\hbar}\hat{n}\cdot\vec{J}_{op}} \tag{9.201}$$

When operating on the angular momentum state vectors we have

$$\hat{U}_{\hat{n}}(2\pi)|j,m\rangle = e^{-\frac{2\pi i}{\hbar}\hat{n}\cdot\vec{J}_{op}}|j,m\rangle = e^{-2\pi ij}|j,m\rangle = (-1)^{2j}|j,m\rangle \tag{9.202}$$

This says that it has no effect if j = integer and multiplies by -1 if j = half-integer.

We usually think of a rotation through $2\pi$ as a trivial operation that changes nothing in a physical system. This belief implies that we are assuming all dynamical variables are invariant under $2\pi$ rotations or that

$$\hat{U}_{\hat{n}}(2\pi)\,\hat{A}\,\hat{U}_{\hat{n}}^{-1}(2\pi) = \hat{A} \tag{9.203}$$

where $\hat{A}$ is any physical observable.

But, as we have seen above, $\hat{U}_{\hat{n}}(2\pi)$ is not equal to a trivial operator (not equal to the identity operator for all physical states and operators). This says that invariance under $\hat{U}_{\hat{n}}(2\pi)$ may lead to nontrivial consequences.

The consequences that arise from invariance of an observable are not identical to those that arise from invariance of a state.

Let $\hat{U}$ be a unitary operator that leaves the observable $\hat{F}$ invariant, so that we have

$$[\hat{U},\hat{F}] = 0 \tag{9.204}$$

Now consider a state that is not invariant under the transformation $\hat{U}$. If it is a pure state represented by $|\psi\rangle$, then $|\psi'\rangle = \hat{U}|\psi\rangle \neq |\psi\rangle$. The expectation value of $\hat{F}$ in the state $|\psi'\rangle$ is

$$\langle\hat{F}\rangle = \langle\psi'|\hat{F}|\psi'\rangle = \langle\psi|\hat{U}^{+}\hat{F}\hat{U}|\psi\rangle = \langle\psi|\hat{U}^{+}\hat{U}\hat{F}|\psi\rangle = \langle\psi|\hat{F}|\psi\rangle \tag{9.205}$$

which implies that the observable statistical properties of $\hat{F}$ are the same in the two states $|\psi\rangle$ and $|\psi'\rangle$. This conclusion certainly holds for $\hat{U}(2\pi)$. Does anything else hold? Is there something peculiar to $\hat{U}(2\pi)$?

It turns out that $\hat{U}(2\pi)$ divides the vector space into two subspaces:

1. integer angular momentum - has states $|+\rangle$ where

$$\hat{U}(2\pi)|+\rangle = |+\rangle \tag{9.206}$$

2. half-integer angular momentum - has states $|-\rangle$ where

$$\hat{U}(2\pi)|-\rangle = -|-\rangle \tag{9.207}$$

Now for any invariant physical observable $\hat{B}$ (where $[\hat{U}(2\pi),\hat{B}] = 0$), we have

$$\begin{aligned}
\langle+|\hat{U}(2\pi)\hat{B}|-\rangle &= \langle+|\hat{B}\hat{U}(2\pi)|-\rangle \\
\langle+|\hat{B}|-\rangle &= -\langle+|\hat{B}|-\rangle \\
\to \langle+|\hat{B}|-\rangle &= 0
\end{aligned} \tag{9.208}$$

This says that all physical observables have vanishing matrix elements between states with integer angular momentum and states with half-integer angular momentum (states in the two subspaces).

This is called a superselection rule.

A superselection rule says that there is no observable distinction among vectors of the form

$$|\psi_{\varphi}\rangle = |+\rangle + e^{i\varphi}|-\rangle \tag{9.209}$$

for different values of the phase $\varphi$. This is so because

$$\langle\psi_{\varphi}|\hat{B}|\psi_{\varphi}\rangle = \langle+|\hat{B}|+\rangle + \langle-|\hat{B}|-\rangle \tag{9.210}$$

and this is independent of $\varphi$.
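A toy numerical illustration of (9.208) to (9.210) may help. The sketch below assumes a two-dimensional space spanned by a single $|+\rangle$ state and a single $|-\rangle$ state, so that an invariant observable is simply a diagonal matrix. The matrices `B_INVARIANT` and `C_MIXING` are made-up examples: the first obeys the superselection rule, the second has a nonzero $\langle+|\hat{C}|-\rangle$ and is therefore not an allowed observable.

```python
import cmath
import math


def expectation(op, psi):
    """<psi|op|psi> for a 2-component vector psi (no normalization applied)."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))


def psi_phi(phi):
    """|psi_phi> = |+> + exp(i phi)|->, components ordered as (|+>, |->)."""
    return [1 + 0j, cmath.exp(1j * phi)]


# Invariant observable: no matrix elements between the two subspaces.
B_INVARIANT = [[0.7, 0.0], [0.0, -1.9]]
# Hypothetical operator that mixes the subspaces (violates the rule).
C_MIXING = [[0.0, 1.0], [1.0, 0.0]]

phases = [0.0, 0.5, 1.0, 2.0, math.pi]
invariant_values = [expectation(B_INVARIANT, psi_phi(p)).real for p in phases]
mixing_values = [expectation(C_MIXING, psi_phi(p)).real for p in phases]

print(invariant_values)  # the same number for every phase
print(mixing_values)     # varies as 2*cos(phi)
```

The first list is constant, as (9.210) requires, while the second depends on the relative phase. That dependence is exactly the interference that the superselection rule says can never be observed.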


What about a more general state represented by the state operator

$$\hat{W} = \sum_{i,j}W_{ij}|i\rangle\langle j| \tag{9.211}$$

If we break up the basis for the space into + and - subspaces, i.e.,

basis set = {{+ states} , {- states}}

then the matrix representation of $\hat{W}$ partitions into four blocks

$$\hat{W} = \begin{pmatrix} W_{++} & W_{+-} \\ W_{-+} & W_{--} \end{pmatrix} \tag{9.212}$$

while, for any physical observable, the same partitioning scheme produces

$$\hat{B} = \begin{pmatrix} B_{++} & 0 \\ 0 & B_{--} \end{pmatrix} \tag{9.213}$$

i.e., there is no mixing between the subspaces. This gives

$$\langle\hat{B}\rangle = Tr(\hat{W}\hat{B}) = Tr_{+}(W_{++}B_{++}) + Tr_{-}(W_{--}B_{--}) \tag{9.214}$$

where $Tr_{\pm}$ implies a trace only over the particular subspace. The cross matrix elements $W_{+-}$ and $W_{-+}$ do not contribute to the expectation value of the observables, i.e., interference between vectors of the $|+\rangle$ and $|-\rangle$ types is not observable.

All equations of motion decouple into two separate equations in each of the two subspaces and no cross-matrix elements of $\hat{W}$ between the two subspaces ever contribute.

If we assume that the cross matrix elements are zero initially, then they will never develop (become nonzero) in time.

What is the difference between a generator $\hat{U}(2\pi)$ of a superselection rule and a symmetry operation that is generated by a universally conserved quantity such as the displacement operator

$$e^{-\frac{i}{\hbar}\vec{a}\cdot\vec{P}_{op}}$$

which is generated by the total momentum $\vec{P}_{op}$?

The Hamiltonian of any closed system is invariant under both transformations.

Both give rise to a quantum number that must be conserved in any transition. In these examples the quantum numbers are

±1 for $\hat{U}(2\pi)$ and the total momentum

The difference is that there exist observables that do not commute with $\vec{P}_{op}$ and $\vec{Q}_{op}$, but there are no observables that do not commute with $\hat{U}(2\pi)$.

By measuring the position one can distinguish states that differ only by a displacement, but there is no way to distinguish between states that differ only by a $2\pi$ rotation.

The superselection rule from $\hat{U}(2\pi)$, which separates the integer and half-integer angular momentum states, is the only such rule in the quantum mechanics of stable particles (non-relativistic quantum mechanics).

In Quantum Field Theory (relativistic quantum mechanics), where particles can be created and annihilated, the total electric charge operator generates another superselection rule, provided that one assumes all observables are invariant under gauge transformations.

This says that no interference can be observed between states of different total 
charge because there are no observables that do not commute with the charge 
operator. 

In a theory of stable particles, the charge of each particle and hence the total 
charge is an invariant. Thus, the total charge operator is simply a multiple of I. 
Every operator commutes with I implying that the charge superselection rule 
is trivial in non-relativistic quantum mechanics. 

Now back to the normal world. 

The techniques that we have developed for the spin = 1/2 system can be applied to any two-state system. Here is an example of a two-sided box solved using both the Schrödinger and Heisenberg pictures.


9.3.3. A Box with 2 Sides 

Let us consider a box containing a particle in a state $|\psi\rangle$. The box is divided into two halves (Right and Left) by a thin partition. The only property that we will assign to the particle is whether it is on the Right or Left side of the box.

This means that the system has only two states, namely, $|R\rangle$ and $|L\rangle$ that must have the properties

$\langle R|\psi\rangle$ = amplitude to find the particle on right side if in state $|\psi\rangle$

$\langle L|\psi\rangle$ = amplitude to find the particle on left side if in state $|\psi\rangle$

We also suppose that the particle can tunnel through the thin partition and that

$$\frac{\partial}{\partial t}\langle R|\psi\rangle = \frac{K}{i\hbar}\langle L|\psi\rangle \quad , \quad K \text{ real} \tag{9.215}$$

How does this system develop in time?

We solve this problem in two ways, namely, using the Schrödinger and Heisenberg pictures.


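Schrödinger Picture

We define the general system state vector

$$|\psi\rangle = \begin{pmatrix}\psi_R \\ \psi_L\end{pmatrix} = \begin{pmatrix}\langle R|\psi\rangle \\ \langle L|\psi\rangle\end{pmatrix} = \psi_R\begin{pmatrix}1 \\ 0\end{pmatrix} + \psi_L\begin{pmatrix}0 \\ 1\end{pmatrix} = \psi_R|R\rangle + \psi_L|L\rangle \tag{9.216}$$

The time-dependent Schrödinger equation is

$$i\hbar\frac{\partial}{\partial t}|\psi(t)\rangle = i\hbar\begin{pmatrix}\frac{\partial\psi_R}{\partial t} \\ \frac{\partial\psi_L}{\partial t}\end{pmatrix} = \hat{H}\begin{pmatrix}\psi_R \\ \psi_L\end{pmatrix} \tag{9.217}$$

Now we are given that

$$\frac{d\psi_R}{dt} = \frac{K}{i\hbar}\psi_L \quad \to \quad \frac{d\psi_R^{*}}{dt} = -\frac{K}{i\hbar}\psi_L^{*} \tag{9.218}$$

The state vector must remain normalized as it develops in time so that

$$\langle\psi|\psi\rangle = |\psi_R|^{2} + |\psi_L|^{2} = 1$$

$$\begin{aligned}
\frac{\partial\langle\psi|\psi\rangle}{\partial t} = 0 &= \frac{\partial\psi_R^{*}}{\partial t}\psi_R + \psi_R^{*}\frac{\partial\psi_R}{\partial t} + \frac{\partial\psi_L^{*}}{\partial t}\psi_L + \psi_L^{*}\frac{\partial\psi_L}{\partial t} \\
0 &= -\frac{K}{i\hbar}\psi_L^{*}\psi_R + \psi_R^{*}\frac{K}{i\hbar}\psi_L + \frac{\partial\psi_L^{*}}{\partial t}\psi_L + \psi_L^{*}\frac{\partial\psi_L}{\partial t}
\end{aligned}$$

which says that

$$\frac{d\psi_L}{dt} = \frac{K}{i\hbar}\psi_R \quad \to \quad \frac{d\psi_L^{*}}{dt} = -\frac{K}{i\hbar}\psi_R^{*} \tag{9.219}$$

Therefore we have

$$\hat{H}\begin{pmatrix}\psi_R \\ \psi_L\end{pmatrix} = K\begin{pmatrix}\psi_L \\ \psi_R\end{pmatrix} \tag{9.220}$$

which says that

$$\hat{H} = K\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix} \tag{9.221}$$

The eigenvalues of $\hat{H}$ are $\pm K$ and its eigenvectors are

$$|\pm K\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ \pm 1\end{pmatrix} \tag{9.222}$$

Note that for $\hat{P}$ = parity operator (switches right and left) we have

$$\hat{P}|\pm K\rangle = \pm|\pm K\rangle \tag{9.223}$$

so these are also states of definite parity.

If the initial state of the system is

$$|\psi(0)\rangle = \begin{pmatrix}\psi_R(0) \\ \psi_L(0)\end{pmatrix} \tag{9.224}$$

then we can write this state in terms of energy eigenstates as

$$|\psi(0)\rangle = \begin{pmatrix}\psi_R(0) \\ \psi_L(0)\end{pmatrix} = \frac{1}{\sqrt{2}}\left(\psi_R(0) + \psi_L(0)\right)|+K\rangle + \frac{1}{\sqrt{2}}\left(\psi_R(0) - \psi_L(0)\right)|-K\rangle \tag{9.225}$$

Since we know the time dependence of energy eigenstates

$$|\pm K\rangle_t = e^{\mp\frac{i}{\hbar}Kt}|\pm K\rangle \tag{9.226}$$

the time dependence of $|\psi(t)\rangle$ is given by

$$|\psi(t)\rangle = \frac{1}{\sqrt{2}}\left(\psi_R(0) + \psi_L(0)\right)e^{-\frac{i}{\hbar}Kt}|+K\rangle + \frac{1}{\sqrt{2}}\left(\psi_R(0) - \psi_L(0)\right)e^{+\frac{i}{\hbar}Kt}|-K\rangle \tag{9.227}$$

or

$$\begin{aligned}
|\psi(t)\rangle &= \frac{1}{2}\begin{pmatrix}(\psi_R(0) + \psi_L(0))e^{-\frac{i}{\hbar}Kt} + (\psi_R(0) - \psi_L(0))e^{+\frac{i}{\hbar}Kt} \\ (\psi_R(0) + \psi_L(0))e^{-\frac{i}{\hbar}Kt} - (\psi_R(0) - \psi_L(0))e^{+\frac{i}{\hbar}Kt}\end{pmatrix} \\
&= \left(\psi_R(0)\cos\frac{Kt}{\hbar} - i\psi_L(0)\sin\frac{Kt}{\hbar}\right)|R\rangle + \left(-i\psi_R(0)\sin\frac{Kt}{\hbar} + \psi_L(0)\cos\frac{Kt}{\hbar}\right)|L\rangle
\end{aligned} \tag{9.228}$$

Therefore, the probability that the particle is on the Right side at time t is

$$P_R = |\langle R|\psi(t)\rangle|^{2} = \left|\psi_R(0)\cos\frac{Kt}{\hbar} - i\psi_L(0)\sin\frac{Kt}{\hbar}\right|^{2} \tag{9.229}$$

Now suppose that

$$\psi_R(0) = \frac{1}{\sqrt{2}} \quad \text{and} \quad \psi_L(0) = e^{i\delta}\psi_R(0) \tag{9.230}$$

This says that the particle was equally likely to be on either side at t = 0, but that the amplitudes differed by a phase factor. In this case, we get

$$P_R = \frac{1}{2}\left(1 + \sin\delta\sin\frac{2Kt}{\hbar}\right) \tag{9.231}$$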
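The Schrödinger-picture result can be cross-checked numerically. The sketch below, in units with $\hbar = 1$ and with an arbitrary illustrative value of K, evolves the amplitudes in two ways, using the closed form (9.228) and the energy-eigenstate expansion (9.227), and compares $|\psi_R(t)|^{2}$ with (9.231). The function names are assumptions made for this example.

```python
import cmath
import math

HBAR = 1.0
K = 0.8  # tunnelling strength in arbitrary units (illustrative value)


def evolve_closed_form(psi_r0, psi_l0, t):
    """Amplitudes (psi_R, psi_L) at time t from the closed form (9.228)."""
    theta = K * t / HBAR
    psi_r = psi_r0 * math.cos(theta) - 1j * psi_l0 * math.sin(theta)
    psi_l = -1j * psi_r0 * math.sin(theta) + psi_l0 * math.cos(theta)
    return psi_r, psi_l


def evolve_via_eigenstates(psi_r0, psi_l0, t):
    """Same amplitudes, built from the |+K> and |-K> components (9.227)."""
    plus = (psi_r0 + psi_l0) / math.sqrt(2) * cmath.exp(-1j * K * t / HBAR)
    minus = (psi_r0 - psi_l0) / math.sqrt(2) * cmath.exp(+1j * K * t / HBAR)
    return (plus + minus) / math.sqrt(2), (plus - minus) / math.sqrt(2)


def p_right_formula(delta, t):
    """P_R(t) = (1/2)(1 + sin(delta) sin(2 K t / hbar)), equation (9.231)."""
    return 0.5 * (1 + math.sin(delta) * math.sin(2 * K * t / HBAR))


worst_mismatch = 0.0
for delta in (0.0, 0.4, math.pi / 2, 2.5):
    psi_r0 = 1 / math.sqrt(2)
    psi_l0 = cmath.exp(1j * delta) * psi_r0
    for t in (0.0, 0.3, 1.1, 4.0):
        r1, l1 = evolve_closed_form(psi_r0, psi_l0, t)
        r2, l2 = evolve_via_eigenstates(psi_r0, psi_l0, t)
        worst_mismatch = max(
            worst_mismatch,
            abs(r1 - r2),
            abs(l1 - l2),
            abs(abs(r1) ** 2 - p_right_formula(delta, t)),
            abs(abs(r1) ** 2 + abs(l1) ** 2 - 1.0),  # normalization
        )

print(worst_mismatch)
```

For $\delta = 0$ the probability stays at 1/2 for all times. The oscillation between the two sides appears only when the initial amplitudes carry a relative phase.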


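Heisenberg Picture

Let us define an operator $\hat{Q}$ such that for a state vector

$$|\psi\rangle = \psi_R|R\rangle + \psi_L|L\rangle$$

we have

$$\langle\psi|\hat{Q}|\psi\rangle = |\psi_R|^{2} = \text{probability that particle is on right side} \tag{9.232}$$

This says that

$$\langle\psi|\hat{Q}|\psi\rangle = |\psi_R|^{2} = \langle\psi|R\rangle\langle R|\psi\rangle \tag{9.233}$$

or

$$\hat{Q} = |R\rangle\langle R| = \text{pure state projection operator} = \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix} = \frac{1}{2}(\hat{I} + \hat{\sigma}_z) \tag{9.234}$$

and that the expectation value of the pure state projection operator is equal to the probability of being in that state. This agrees with our earlier discussions.

Now we have

$$\hat{H} = K\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix} = K\hat{\sigma}_x \tag{9.235}$$

Therefore,

$$\hat{Q}(t) = e^{\frac{i}{\hbar}\hat{H}t}\,\hat{Q}\,e^{-\frac{i}{\hbar}\hat{H}t} \tag{9.236}$$

Now as we saw earlier

$$e^{\pm\frac{i}{\hbar}\hat{H}t} = e^{\pm\frac{i}{\hbar}K\hat{\sigma}_x t} = \cos\frac{Kt}{\hbar}\,\hat{I} \pm i\sin\frac{Kt}{\hbar}\,\hat{\sigma}_x \tag{9.237}$$

Thus,

$$\begin{aligned}
\hat{Q}(t) &= \left(\cos\frac{Kt}{\hbar}\hat{I} + i\sin\frac{Kt}{\hbar}\hat{\sigma}_x\right)\frac{1}{2}(\hat{I} + \hat{\sigma}_z)\left(\cos\frac{Kt}{\hbar}\hat{I} - i\sin\frac{Kt}{\hbar}\hat{\sigma}_x\right) \\
&= \frac{1}{2}\left[\cos^{2}\frac{Kt}{\hbar} + \sin^{2}\frac{Kt}{\hbar} + \cos^{2}\frac{Kt}{\hbar}\hat{\sigma}_z - i\sin\frac{Kt}{\hbar}\cos\frac{Kt}{\hbar}\hat{\sigma}_z\hat{\sigma}_x + i\sin\frac{Kt}{\hbar}\cos\frac{Kt}{\hbar}\hat{\sigma}_x\hat{\sigma}_z + \sin^{2}\frac{Kt}{\hbar}\hat{\sigma}_x\hat{\sigma}_z\hat{\sigma}_x\right] \\
&= \frac{1}{2}\left[\hat{I} + \cos^{2}\frac{Kt}{\hbar}\hat{\sigma}_z + 2\sin\frac{Kt}{\hbar}\cos\frac{Kt}{\hbar}\hat{\sigma}_y - \sin^{2}\frac{Kt}{\hbar}\hat{\sigma}_z\right] \\
&= \frac{1}{2}\left[\hat{I} + \cos\frac{2Kt}{\hbar}\hat{\sigma}_z + \sin\frac{2Kt}{\hbar}\hat{\sigma}_y\right]
\end{aligned} \tag{9.238}$$

Then

$$P_R(t) = \langle\psi(0)|\hat{Q}(t)|\psi(0)\rangle \tag{9.239}$$

where

$$|\psi(0)\rangle = \frac{1}{\sqrt{2}}|R\rangle + \frac{e^{i\delta}}{\sqrt{2}}|L\rangle \tag{9.240}$$

Now

$$\hat{\sigma}_z|R\rangle = |R\rangle \quad , \quad \hat{\sigma}_z|L\rangle = -|L\rangle$$

$$\hat{\sigma}_y|R\rangle = i|L\rangle \quad , \quad \hat{\sigma}_y|L\rangle = -i|R\rangle$$

so that $\langle\psi(0)|\hat{\sigma}_z|\psi(0)\rangle = 0$ and $\langle\psi(0)|\hat{\sigma}_y|\psi(0)\rangle = \sin\delta$, and we get

$$P_R = \frac{1}{2}\left(1 + \sin\delta\sin\frac{2Kt}{\hbar}\right) \tag{9.241}$$

as before.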
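The Heisenberg-picture algebra can also be verified by direct matrix multiplication. The following sketch assumes $\hbar = 1$ and an illustrative value of K. It forms $\hat{Q}(t) = \hat{U}^{+}(t)\hat{Q}\hat{U}(t)$ with $\hat{U}(t) = e^{-i\hat{H}t/\hbar}$, compares it with the closed form (9.238), and then evaluates the expectation value in the initial state (9.240). All function names are hypothetical helpers introduced for this example.

```python
import cmath
import math

HBAR = 1.0
K = 0.8  # illustrative tunnelling strength (assumed value)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def time_evolution(t):
    """U(t) = exp(-i H t / hbar) with H = K sigma_x, written in closed form."""
    c, s = math.cos(K * t / HBAR), math.sin(K * t / HBAR)
    return [[c, -1j * s], [-1j * s, c]]


Q = [[1, 0], [0, 0]]  # projector |R><R| in the (|R>, |L>) basis


def q_heisenberg(t):
    """Q(t) = U^dagger(t) Q U(t)."""
    u = time_evolution(t)
    return matmul(matmul(dagger(u), Q), u)


def q_closed_form(t):
    """(1/2)[I + cos(2Kt/hbar) sigma_z + sin(2Kt/hbar) sigma_y]."""
    c2 = math.cos(2 * K * t / HBAR)
    s2 = math.sin(2 * K * t / HBAR)
    return [[0.5 * (1 + c2), -0.5j * s2], [0.5j * s2, 0.5 * (1 - c2)]]


def expectation(op, psi):
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))


delta, t = 0.7, 1.9
psi0 = [1 / math.sqrt(2) + 0j, cmath.exp(1j * delta) / math.sqrt(2)]

operator_error = max(abs(q_heisenberg(t)[i][j] - q_closed_form(t)[i][j])
                     for i in range(2) for j in range(2))
p_right = expectation(q_heisenberg(t), psi0).real
p_expected = 0.5 * (1 + math.sin(delta) * math.sin(2 * K * t / HBAR))
print(operator_error, p_right, p_expected)
```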


9.4. Magnetic Resonance 

How does an experimentalist observe the spin of a particle?

Classically, a spinning charge distribution will have an associated magnetic moment. In non-relativistic quantum mechanics particles with internal spin degrees of freedom also have magnetic moments which are connected to their angular momentum. We write for the magnetic moment operator

$$\vec{M}_{op} = \vec{M}_{op}^{orbital} + \vec{M}_{op}^{spin} = \frac{q}{2mc}g_{\ell}\vec{L}_{op} + \frac{q}{2mc}g_{s}\vec{S}_{op} = \frac{q}{2mc}\left(g_{\ell}\vec{L}_{op} + g_{s}\vec{S}_{op}\right) \tag{9.242}$$

where, as we shall derive later,

$$g_{j\ell s} = 1 + \frac{j(j+1) - \ell(\ell+1) + s(s+1)}{2j(j+1)} \tag{9.243}$$

$$g_{\ell} = g_{jj0} = 1 \quad \text{and} \quad g_{s} = g_{j0j} = 2 \tag{9.244}$$
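As a quick arithmetic check of (9.243) and (9.244), the g-factor formula can be evaluated with exact fractions. The function name `lande_g` is an illustrative choice, and the third case ($j = 3/2$, $\ell = 1$, $s = 1/2$) is an extra example not taken from the text.

```python
from fractions import Fraction


def lande_g(j, l, s):
    """g = 1 + [j(j+1) - l(l+1) + s(s+1)] / [2 j (j+1)], in exact arithmetic."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    if j == 0:
        raise ValueError("the formula is undefined for j = 0")
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))


half = Fraction(1, 2)

g_orbital = lande_g(2, 2, 0)                # pure orbital case: j = l, s = 0
g_spin = lande_g(half, 0, half)             # pure spin case: l = 0, s = j
g_mixed = lande_g(Fraction(3, 2), 1, half)  # j = 3/2, l = 1, s = 1/2

print(g_orbital, g_spin, g_mixed)
```

The first two values reproduce $g_{\ell} = 1$ and $g_{s} = 2$; the mixed case gives an intermediate value.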



Therefore we have

$$\vec{M}_{op} = \frac{q}{2mc}\left(\vec{L}_{op} + 2\vec{S}_{op}\right) = \frac{q}{2mc}\left(\vec{J}_{op} + \vec{S}_{op}\right) \tag{9.245}$$

which says that $\vec{M}_{op}$ is not parallel to $\vec{J}_{op}$. The energy operator or contribution to the Hamiltonian operator from the magnetic moment in a magnetic field $\vec{B}(\vec{r},t)$ is

$$\hat{H}_M = -\vec{M}_{op}\cdot\vec{B}(\vec{r},t) \tag{9.246}$$

Spin measurement experiments are designed to detect the effects of this extra contribution to the system energy and thus detect the effects of spin angular momentum.

We will return to a full discussion of the effect of $\hat{H}_M$ on the energy levels of atoms, etc. in a later chapter. For now we will restrict our attention to spin space and investigate the effect of the spin contribution to $\hat{H}_M$ on the states of a particle.

We will use

$$\hat{H}_{spin} = -g\frac{q}{2mc}\vec{B}\cdot\vec{S}_{op} \tag{9.247}$$

where we have replaced $g_s$ by $g$. We have not set g = 2 since it turns out that it is not exactly that value (there are small corrections from relativistic quantum field theory).

In the Schrödinger picture we have

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = \hat{H}_{spin}|\psi(t)\rangle \tag{9.248}$$

If we ignore spatial dependences (worry only about spin effects), we have

$$i\hbar\frac{d}{dt}\begin{pmatrix}\langle\hat{z}+|\psi(t)\rangle \\ \langle\hat{z}-|\psi(t)\rangle\end{pmatrix} = -g\frac{q}{2mc}\vec{B}\cdot\vec{S}_{op}\begin{pmatrix}\langle\hat{z}+|\psi(t)\rangle \\ \langle\hat{z}-|\psi(t)\rangle\end{pmatrix} \tag{9.249}$$

$$= -g\frac{q\hbar}{4mc}\vec{B}\cdot\vec{\sigma}_{op}\begin{pmatrix}\langle\hat{z}+|\psi(t)\rangle \\ \langle\hat{z}-|\psi(t)\rangle\end{pmatrix} = -g\frac{q\hbar}{4mc}\begin{pmatrix}B_z & B_x - iB_y \\ B_x + iB_y & -B_z\end{pmatrix}\begin{pmatrix}\langle\hat{z}+|\psi(t)\rangle \\ \langle\hat{z}-|\psi(t)\rangle\end{pmatrix} \tag{9.250}$$

This represents two coupled differential equations for the time dependence of the amplitudes $\langle+|\psi\rangle$ and $\langle-|\psi\rangle$. When solved, the solution tells us the time dependence of the measurable probabilities.

We can see the physics of the motion best in the Heisenberg picture, where the operators, instead of the states, move in time. In this case, we have

$$i\hbar\frac{d\hat{S}_i(t)}{dt} = [\hat{S}_i(t),\hat{H}_{spin}(t)] = -g\frac{q}{2mc}[\hat{S}_i(t),\hat{S}_j(t)]B_j(t) = -i\hbar\frac{gq}{2mc}\varepsilon_{ijk}\hat{S}_k(t)B_j(t) \tag{9.251}$$

for each component. The operators are all time-dependent Heisenberg picture operators. This gives

$$\frac{d\vec{S}_{op}(t)}{dt} = \frac{gq}{2mc}\vec{S}_{op}(t)\times\vec{B}(t) = \vec{M}_{spin}(t)\times\vec{B}(t) \tag{9.252}$$

The right-hand side is the torque exerted by the magnetic field on the magnetic moment.

In operator language, this equation implies that the rate of change of the spin angular momentum vector equals the applied torque. This implies that the spin vector with q < 0 precesses in a positive sense about the magnetic field direction as shown in Figure 9.1 below.



Figure 9.1: Motion of Spin 


Suppose that $\vec{B}(t) = B_0\hat{z}$ (independent of t). We then have

$$\frac{d\hat{S}_z(t)}{dt} = 0 \tag{9.253}$$

$$\frac{d\hat{S}_x(t)}{dt} = \frac{gqB_0}{2mc}\hat{S}_y(t) \tag{9.254}$$

$$\frac{d\hat{S}_y(t)}{dt} = -\frac{gqB_0}{2mc}\hat{S}_x(t) \tag{9.255}$$

These coupled differential equations have the solutions

$$\hat{S}_x(t) = \hat{S}_x(0)\cos\omega_0 t + \hat{S}_y(0)\sin\omega_0 t \tag{9.256}$$

$$\hat{S}_y(t) = -\hat{S}_x(0)\sin\omega_0 t + \hat{S}_y(0)\cos\omega_0 t \tag{9.257}$$

$$\hat{S}_z(t) = \hat{S}_z(0) \tag{9.258}$$

where

$$\omega_0 = \frac{gqB_0}{2mc} \tag{9.259}$$

Now suppose that at t = 0, the spin is in the +x-direction, which says that

$$\langle\hat{S}_x(0)\rangle = \langle\hat{x}+|\hat{S}_x(0)|\hat{x}+\rangle = \frac{\hbar}{2} \tag{9.260}$$

$$\langle\hat{S}_y(0)\rangle = \langle\hat{S}_z(0)\rangle = 0 \tag{9.261}$$

Therefore, the expectation values of the solutions become

$$\langle\hat{S}_x(t)\rangle = \frac{\hbar}{2}\cos\omega_0 t \tag{9.262}$$

$$\langle\hat{S}_y(t)\rangle = -\frac{\hbar}{2}\sin\omega_0 t \tag{9.263}$$

$$\langle\hat{S}_z(t)\rangle = 0 \tag{9.264}$$

which says that the expectation value of the spin vector rotates in a negative sense in the x - y plane (precession about the z-axis).
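Because the equations of motion (9.253) to (9.255) are linear, the expectation values obey the same equations as the operators, so the precession can be reproduced by integrating $d\langle\vec{S}\rangle/dt = \omega_0\langle\vec{S}\rangle\times\hat{z}$ numerically. The sketch below uses a hand-written fourth-order Runge-Kutta step with $\hbar = 1$ and an arbitrary value of $\omega_0$; the step size, the final time and the function names are illustrative assumptions.

```python
import math

HBAR = 1.0
OMEGA0 = 2.0  # g q B0 / (2 m c) in arbitrary units (illustrative value)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torque_rhs(s):
    """d<S>/dt = omega0 * (<S> x z_hat) for a field along +z."""
    return tuple(OMEGA0 * c for c in cross(s, (0.0, 0.0, 1.0)))


def rk4_step(s, dt):
    """One classical Runge-Kutta step for the three spin components."""
    def shifted(base, k, h):
        return tuple(b + h * ki for b, ki in zip(base, k))

    k1 = torque_rhs(s)
    k2 = torque_rhs(shifted(s, k1, 0.5 * dt))
    k3 = torque_rhs(shifted(s, k2, 0.5 * dt))
    k4 = torque_rhs(shifted(s, k3, dt))
    return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


t_final, n_steps = 3.0, 3000
dt = t_final / n_steps
spin = (HBAR / 2, 0.0, 0.0)  # spin initially along +x
for _ in range(n_steps):
    spin = rk4_step(spin, dt)

expected = (HBAR / 2 * math.cos(OMEGA0 * t_final),
            -HBAR / 2 * math.sin(OMEGA0 * t_final),
            0.0)
precession_error = max(abs(a - b) for a, b in zip(spin, expected))
print(spin, expected, precession_error)
```

With $\omega_0 > 0$ the x-component decreases while the y-component becomes negative, which is the negative-sense rotation described above.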


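Now let us return to the Schrödinger picture to see what precession looks like there. We will do the calculation a couple of different ways.

First we use the time-development operator. For $\vec{B} = B\hat{n}$ we have

$$\hat{H} = -\frac{gq\hbar B}{4mc}\vec{\sigma}_{op}\cdot\hat{n} \tag{9.265}$$

If we define

$$\omega_L = \frac{qB}{mc} = \text{Larmor frequency} \tag{9.266}$$

and let g = 2 we have

$$\hat{H} = -\frac{1}{2}\hbar\omega_L\,\vec{\sigma}_{op}\cdot\hat{n} \tag{9.267}$$

In the Schrödinger picture,

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = \hat{H}_{spin}|\psi(t)\rangle \tag{9.268}$$

When $\hat{H}_{spin}(t)$ is time-independent, we have the solution

$$|\psi(t)\rangle = \hat{U}(t)|\psi(0)\rangle \tag{9.269}$$

where

$$\hat{U}(t) = e^{-\frac{i}{\hbar}\hat{H}t} \tag{9.270}$$

or

$$|\psi(t)\rangle = e^{i\frac{\omega_L t}{2}\vec{\sigma}_{op}\cdot\hat{n}}|\psi(0)\rangle \tag{9.271}$$

Since $\hat{n}$ is a unit vector, we have $(\vec{\sigma}_{op}\cdot\hat{n})^{2} = \hat{I}$, which gives, as before,

$$|\psi(t)\rangle = \left(\cos\frac{\omega_L t}{2} + i(\vec{\sigma}_{op}\cdot\hat{n})\sin\frac{\omega_L t}{2}\right)|\psi(0)\rangle \tag{9.272}$$

This is the solution to the Schrödinger equation in this case.

Now let

$$|\psi(0)\rangle = \begin{pmatrix}1 \\ 0\end{pmatrix} = |\hat{z}+\rangle = |+\rangle \tag{9.273}$$

From earlier we have

$$\vec{\sigma}_{op}\cdot\hat{n} = \begin{pmatrix}n_z & n_x - in_y \\ n_x + in_y & -n_z\end{pmatrix} \tag{9.274}$$

We then have

$$\begin{aligned}
|\psi(t)\rangle &= \cos\frac{\omega_L t}{2}|+\rangle + i(\vec{\sigma}_{op}\cdot\hat{n})|+\rangle\sin\frac{\omega_L t}{2} \\
&= \cos\frac{\omega_L t}{2}|+\rangle + i\begin{pmatrix}n_z & n_x - in_y \\ n_x + in_y & -n_z\end{pmatrix}\begin{pmatrix}1 \\ 0\end{pmatrix}\sin\frac{\omega_L t}{2} \\
&= \cos\frac{\omega_L t}{2}|+\rangle + i\sin\frac{\omega_L t}{2}\begin{pmatrix}n_z \\ n_x + in_y\end{pmatrix} \\
&= \cos\frac{\omega_L t}{2}|+\rangle + i\sin\frac{\omega_L t}{2}\left(n_z|+\rangle + (n_x + in_y)|-\rangle\right) \\
&= \left(\cos\frac{\omega_L t}{2} + in_z\sin\frac{\omega_L t}{2}\right)|+\rangle + i(n_x + in_y)\sin\frac{\omega_L t}{2}|-\rangle
\end{aligned} \tag{9.275}$$

This says that as the initial state $|+\rangle$ develops in time it picks up some amplitude to be in the $|-\rangle$ state.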


Special Case

Let

$$\hat{n} = \hat{y} \to \vec{B} \text{ is in the } y\text{-direction} \quad , \quad n_y = 1 \; , \; n_x = n_z = 0$$

This gives

$$|\psi(t)\rangle = \cos\frac{\omega_L t}{2}|+\rangle - \sin\frac{\omega_L t}{2}|-\rangle \tag{9.276}$$

which implies that the amplitudes oscillate with frequency $\nu = \omega_L/4\pi$, so that $\langle\hat{S}_z\rangle = \frac{\hbar}{2}\cos\omega_L t$ flips sign periodically, or that the spin vector is precessing around the direction of the magnetic field.

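Finally, let us just solve the Schrödinger equation directly. We have

$$i\hbar\frac{d}{dt}\begin{pmatrix}\langle\hat{z}+|\psi(t)\rangle \\ \langle\hat{z}-|\psi(t)\rangle\end{pmatrix} = -\frac{1}{2}\frac{q\hbar}{mc}\begin{pmatrix}B_z & B_x - iB_y \\ B_x + iB_y & -B_z\end{pmatrix}\begin{pmatrix}\langle\hat{z}+|\psi(t)\rangle \\ \langle\hat{z}-|\psi(t)\rangle\end{pmatrix} \tag{9.277}$$

If we choose $\hat{n} = \hat{x}$, then we have

$$i\hbar\frac{d}{dt}\begin{pmatrix}a(t) \\ b(t)\end{pmatrix} = -\frac{1}{2}\hbar\omega_L\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}\begin{pmatrix}a(t) \\ b(t)\end{pmatrix} \tag{9.278}$$

where $a(t) = \langle\hat{z}+|\psi(t)\rangle$ and $b(t) = \langle\hat{z}-|\psi(t)\rangle$ and $|a|^{2} + |b|^{2} = 1$.

This gives two coupled differential equations

$$\dot{a} = i\frac{\omega_L}{2}b \quad \text{and} \quad \dot{b} = i\frac{\omega_L}{2}a \tag{9.279}$$

which are solved as follows:

$$(\dot{a} + \dot{b}) = i\frac{\omega_L}{2}(a + b) \to a + b = (a(0) + b(0))e^{i\frac{\omega_L}{2}t} \tag{9.280}$$

$$(\dot{a} - \dot{b}) = -i\frac{\omega_L}{2}(a - b) \to a - b = (a(0) - b(0))e^{-i\frac{\omega_L}{2}t} \tag{9.281}$$

or

$$a(t) = a(0)\cos\frac{\omega_L}{2}t + ib(0)\sin\frac{\omega_L}{2}t \tag{9.282}$$

$$b(t) = ia(0)\sin\frac{\omega_L}{2}t + b(0)\cos\frac{\omega_L}{2}t \tag{9.283}$$

For the initial state

$$|\psi(0)\rangle = \begin{pmatrix}a(0) \\ b(0)\end{pmatrix} = \begin{pmatrix}1 \\ 0\end{pmatrix} \tag{9.284}$$

we get

$$a(t) = \cos\frac{\omega_L}{2}t \quad , \quad b(t) = i\sin\frac{\omega_L}{2}t \tag{9.285}$$

and

$$|\psi(t)\rangle = \cos\frac{\omega_L}{2}t\,|+\rangle + i\sin\frac{\omega_L}{2}t\,|-\rangle \tag{9.286}$$

as before.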
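One simple way to gain confidence in the solutions (9.282) and (9.283) is to substitute them back into the coupled equations (9.279) numerically. The sketch below approximates the time derivatives by central finite differences and reports the residuals; the value of $\omega_L$, the finite-difference step and the sample initial conditions are assumptions chosen only for illustration.

```python
import math

OMEGA_L = 1.7  # Larmor frequency in arbitrary units (illustrative value)


def amplitudes(t, a0, b0):
    """Closed-form a(t), b(t) from (9.282) and (9.283)."""
    half = 0.5 * OMEGA_L * t
    a = a0 * math.cos(half) + 1j * b0 * math.sin(half)
    b = 1j * a0 * math.sin(half) + b0 * math.cos(half)
    return a, b


def residuals(t, a0, b0, h=1e-5):
    """|da/dt - i(w/2) b| and |db/dt - i(w/2) a|, derivatives by central differences."""
    a_plus, b_plus = amplitudes(t + h, a0, b0)
    a_minus, b_minus = amplitudes(t - h, a0, b0)
    a_dot = (a_plus - a_minus) / (2 * h)
    b_dot = (b_plus - b_minus) / (2 * h)
    a, b = amplitudes(t, a0, b0)
    return abs(a_dot - 0.5j * OMEGA_L * b), abs(b_dot - 0.5j * OMEGA_L * a)


worst_residual = 0.0
worst_norm_drift = 0.0
for a0, b0 in ((1.0, 0.0), (0.6, 0.8j)):
    for t in (0.0, 0.4, 1.3, 5.0):
        res_a, res_b = residuals(t, a0, b0)
        a, b = amplitudes(t, a0, b0)
        worst_residual = max(worst_residual, res_a, res_b)
        worst_norm_drift = max(worst_norm_drift,
                               abs(abs(a) ** 2 + abs(b) ** 2 - 1.0))

print(worst_residual, worst_norm_drift)
```

The residuals are limited only by the finite-difference approximation, and the normalization $|a|^{2} + |b|^{2} = 1$ is preserved.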
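9.4.1. Spin Resonance

Let us now consider the addition of an oscillating magnetic field perpendicular to the applied field $\vec{B} = B_0\hat{z}$. In particular, we add the field

$$\vec{B}_1 = B_1\cos\omega t\,\hat{x} \tag{9.287}$$

In the Schrödinger picture, we then have the equation of motion

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = -\frac{e\hbar}{2mc}\frac{g}{2}\left(B_0\hat{\sigma}_z + B_1\cos\omega t\,\hat{\sigma}_x\right)|\psi(t)\rangle \tag{9.288}$$

Now we first make the transformation (shift to a rotating coordinate system, since we know that if the extra field were not present the spin would just be precessing about the direction of the applied field)

$$|\psi(t)\rangle = e^{i\frac{\omega t}{2}\hat{\sigma}_z}|\psi'(t)\rangle \tag{9.289}$$

which gives

$$i\hbar\frac{\partial}{\partial t}\left(e^{i\frac{\omega t}{2}\hat{\sigma}_z}|\psi'(t)\rangle\right) = -\frac{e\hbar}{2mc}\frac{g}{2}\left(B_0\hat{\sigma}_z + B_1\cos\omega t\,\hat{\sigma}_x\right)e^{i\frac{\omega t}{2}\hat{\sigma}_z}|\psi'(t)\rangle \tag{9.290}$$

$$i\hbar\,e^{-i\frac{\omega t}{2}\hat{\sigma}_z}\frac{\partial}{\partial t}\left(e^{i\frac{\omega t}{2}\hat{\sigma}_z}|\psi'(t)\rangle\right) = -\frac{e\hbar}{2mc}\frac{g}{2}\left(B_0\,e^{-i\frac{\omega t}{2}\hat{\sigma}_z}\hat{\sigma}_z e^{i\frac{\omega t}{2}\hat{\sigma}_z} + B_1\cos\omega t\;e^{-i\frac{\omega t}{2}\hat{\sigma}_z}\hat{\sigma}_x e^{i\frac{\omega t}{2}\hat{\sigma}_z}\right)|\psi'(t)\rangle \tag{9.291}$$

or

$$i\hbar\,e^{-i\frac{\omega t}{2}\hat{\sigma}_z}\left[e^{i\frac{\omega t}{2}\hat{\sigma}_z}\frac{\partial}{\partial t}|\psi'(t)\rangle + i\frac{\omega}{2}\hat{\sigma}_z e^{i\frac{\omega t}{2}\hat{\sigma}_z}|\psi'(t)\rangle\right] = -\frac{e\hbar}{2mc}\frac{g}{2}\left(B_0\hat{\sigma}_z + B_1\cos\omega t\;e^{-i\frac{\omega t}{2}\hat{\sigma}_z}\hat{\sigma}_x e^{i\frac{\omega t}{2}\hat{\sigma}_z}\right)|\psi'(t)\rangle \tag{9.292}$$

Now

$$\begin{aligned}
\cos\omega t\;e^{-i\frac{\omega t}{2}\hat{\sigma}_z}\hat{\sigma}_x e^{i\frac{\omega t}{2}\hat{\sigma}_z} &= \cos\omega t\;\hat{\sigma}_x e^{i\omega t\hat{\sigma}_z} \\
&= \hat{\sigma}_x\left(\cos^{2}\omega t + i\hat{\sigma}_z\cos\omega t\sin\omega t\right) \\
&= \hat{\sigma}_x\left(\frac{1}{2} + \frac{1}{2}\cos 2\omega t + i\hat{\sigma}_z\frac{1}{2}\sin 2\omega t\right) \\
&= \frac{\hat{\sigma}_x}{2} + \frac{1}{2}\left(\hat{\sigma}_x\cos 2\omega t + i\hat{\sigma}_x\hat{\sigma}_z\sin 2\omega t\right) \\
&= \frac{\hat{\sigma}_x}{2} + \frac{1}{2}\left(\hat{\sigma}_x\cos 2\omega t + \hat{\sigma}_y\sin 2\omega t\right)
\end{aligned} \tag{9.293}$$

Defining

$$\omega_0 = \frac{geB_0}{2mc} \quad \text{and} \quad \omega_1 = \frac{geB_1}{4mc} \tag{9.294}$$

we get

$$i\hbar\frac{\partial}{\partial t}|\psi'(t)\rangle = \hbar\left(\frac{\omega - \omega_0}{2}\hat{\sigma}_z - \frac{\omega_1}{2}\hat{\sigma}_x\right)|\psi'(t)\rangle - \frac{\hbar\omega_1}{2}\left(\hat{\sigma}_x\cos 2\omega t + \hat{\sigma}_y\sin 2\omega t\right)|\psi'(t)\rangle \tag{9.295}$$

The two higher frequency terms produce high frequency wiggles in $|\psi'(t)\rangle$. Since we will be looking at the average motion of the spin vector (expectation values changing in time), these terms will average to zero and we can neglect them in this discussion.

We thus have

$$i\hbar\frac{\partial}{\partial t}|\psi'(t)\rangle = \hbar\left(\frac{\omega - \omega_0}{2}\hat{\sigma}_z - \frac{\omega_1}{2}\hat{\sigma}_x\right)|\psi'(t)\rangle \tag{9.296}$$

which has a solution

$$|\psi'(t)\rangle = e^{-i\frac{\Omega t}{2}\hat{\sigma}}|\psi'(0)\rangle \tag{9.297}$$

where

$$\Omega = \left[(\omega - \omega_0)^{2} + \omega_1^{2}\right]^{1/2} \quad \text{and} \quad \hat{\sigma} = \frac{\omega - \omega_0}{\Omega}\hat{\sigma}_z - \frac{\omega_1}{\Omega}\hat{\sigma}_x \tag{9.298}$$

We note that

$$\hat{\sigma}^{2} = \frac{(\omega - \omega_0)^{2} + \omega_1^{2}}{\Omega^{2}}\hat{I} = \hat{I} \tag{9.299}$$

The final solution to the original problem (again neglecting the higher frequency terms) is then (after leaving the rotating system)

$$|\psi(t)\rangle = e^{i\frac{\omega t}{2}\hat{\sigma}_z}e^{-i\frac{\Omega t}{2}\hat{\sigma}}|\psi(0)\rangle \tag{9.300}$$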

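Example

Let us choose

$$|\psi(0)\rangle = |+\rangle = \begin{pmatrix}1 \\ 0\end{pmatrix} \tag{9.301}$$

We get

$$\begin{aligned}
|\psi(t)\rangle &= e^{i\frac{\omega t}{2}\hat{\sigma}_z}e^{-i\frac{\Omega t}{2}\hat{\sigma}}|+\rangle = e^{i\frac{\omega t}{2}\hat{\sigma}_z}\left(\cos\frac{\Omega t}{2} - i\hat{\sigma}\sin\frac{\Omega t}{2}\right)|+\rangle \\
&= e^{i\frac{\omega t}{2}\hat{\sigma}_z}\left(\cos\frac{\Omega t}{2}|+\rangle - i\frac{\omega - \omega_0}{\Omega}\sin\frac{\Omega t}{2}\hat{\sigma}_z|+\rangle + i\frac{\omega_1}{\Omega}\sin\frac{\Omega t}{2}\hat{\sigma}_x|+\rangle\right)
\end{aligned} \tag{9.302}$$

Now using

$$\hat{\sigma}_z|+\rangle = |+\rangle \quad \text{and} \quad \hat{\sigma}_x|+\rangle = |-\rangle \tag{9.303}$$

$$e^{i\frac{\omega t}{2}\hat{\sigma}_z} = \cos\frac{\omega t}{2} + i\sin\frac{\omega t}{2}\hat{\sigma}_z \tag{9.304}$$

we have

$$|\psi(t)\rangle = A^{(+)}|+\rangle + A^{(-)}|-\rangle \tag{9.305}$$

$$A^{(+)} = \cos\frac{\omega t}{2}\cos\frac{\Omega t}{2} + i\sin\frac{\omega t}{2}\cos\frac{\Omega t}{2} - i\frac{\omega - \omega_0}{\Omega}\cos\frac{\omega t}{2}\sin\frac{\Omega t}{2} + \frac{\omega - \omega_0}{\Omega}\sin\frac{\omega t}{2}\sin\frac{\Omega t}{2} \tag{9.306}$$

$$A^{(-)} = i\frac{\omega_1}{\Omega}\cos\frac{\omega t}{2}\sin\frac{\Omega t}{2} + \frac{\omega_1}{\Omega}\sin\frac{\omega t}{2}\sin\frac{\Omega t}{2} \tag{9.307}$$

Therefore, the amplitude for spin flip to the state $|\hat{z}-\rangle = |-\rangle$ at time t is

$$\langle-|\psi(t)\rangle = A^{(-)} = i\frac{\omega_1}{\Omega}\cos\frac{\omega t}{2}\sin\frac{\Omega t}{2} + \frac{\omega_1}{\Omega}\sin\frac{\omega t}{2}\sin\frac{\Omega t}{2} = i\frac{\omega_1}{\Omega}\sin\frac{\Omega t}{2}\,e^{-i\frac{\omega t}{2}} \tag{9.308}$$

and the probability of spin flip is

$$P_{flip}(t) = |\langle-|\psi(t)\rangle|^{2} = \frac{\omega_1^{2}}{\Omega^{2}}\sin^{2}\frac{\Omega t}{2} = \frac{\omega_1^{2}}{2\Omega^{2}}\left(1 - \cos\Omega t\right) \tag{9.309}$$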


What is happening to the spin vector? If we plot $P_{flip}(t)$ versus t as in Figure 9.2 below we get

Figure 9.2: Spin-Flip Probability versus Time

where the peak values are given by $\omega_1^{2}/\Omega^{2}$. What is the value of $\omega_1^{2}/\Omega^{2}$ and can it be large? If we plot $\omega_1^{2}/\Omega^{2}$ versus $\omega$ (frequency of added field) we get

Figure 9.3: Resonance Curve

The peak occurs at $\omega - \omega_0 \approx 0$ (called a resonance). Therefore, if $|\omega - \omega_0| \gg \omega_1$ (called off-resonance), the maximum probability for spin flip is small (this corresponds to Figure 9.2). However, if $\omega - \omega_0 \approx 0$, which corresponds to resonance, the maximum probability is $\approx 1$ and the spin has flipped with certainty. The spin system preferentially absorbs energy (flipping spin) near resonance.
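The shape of both curves follows directly from the flip probability $\frac{\omega_1^{2}}{\Omega^{2}}\sin^{2}\frac{\Omega t}{2}$ with $\Omega^{2} = (\omega - \omega_0)^{2} + \omega_1^{2}$. The short sketch below evaluates the peak value on and off resonance; the numerical values of $\omega_0$ and $\omega_1$ and the function names are illustrative assumptions, and the high-frequency terms are neglected exactly as in the text.

```python
import math


def rabi_frequency(omega, omega0, omega1):
    """Omega = sqrt((omega - omega0)^2 + omega1^2)."""
    return math.sqrt((omega - omega0) ** 2 + omega1 ** 2)


def flip_probability(t, omega, omega0, omega1):
    """P_flip(t) = (omega1/Omega)^2 sin^2(Omega t / 2)."""
    big_omega = rabi_frequency(omega, omega0, omega1)
    return (omega1 / big_omega) ** 2 * math.sin(0.5 * big_omega * t) ** 2


def peak_flip_probability(omega, omega0, omega1):
    """Maximum of P_flip over time, i.e. the resonance curve omega1^2/Omega^2."""
    return (omega1 / rabi_frequency(omega, omega0, omega1)) ** 2


OMEGA0 = 10.0  # precession frequency in the static field (illustrative)
OMEGA1 = 0.2   # strength of the oscillating field (illustrative)

on_resonance_peak = peak_flip_probability(OMEGA0, OMEGA0, OMEGA1)
off_resonance_peak = peak_flip_probability(OMEGA0 + 10 * OMEGA1, OMEGA0, OMEGA1)

# On resonance Omega = omega1, so the flip is complete at t = pi / omega1.
complete_flip = flip_probability(math.pi / OMEGA1, OMEGA0, OMEGA0, OMEGA1)

print(on_resonance_peak, off_resonance_peak, complete_flip)
```

With a detuning of ten times $\omega_1$ the peak probability drops to 1/101, which shows how narrow the resonance is when the oscillating field is weak.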

This spin resonance process is used to determine a wide variety of spin properties 
of systems. 


9.5. Addition of Angular Momentum 

A derivation of the addition process for arbitrary angular momentum values is 
very complex. We can, however, learn and understand all of the required steps 
within the context of a special case, namely, combining two spin = 1/2 systems 
into a new system. We will do the general derivation after the special case. 

9.5.1. Addition of Two Spin =1/2 Angular Momenta 

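We define

$$\vec{S}_{1,op} = \text{spin operator for system 1} \tag{9.310}$$

$$\vec{S}_{2,op} = \text{spin operator for system 2} \tag{9.311}$$

The operators for system 1 are assumed to be independent of the operators for system 2, which implies that

$$[\hat{S}_{1i},\hat{S}_{2j}] = 0 \quad \text{for all } i \text{ and } j \tag{9.312}$$

We characterize each system by eigenvalue/eigenvector equations

$$\vec{S}^{2}_{1,op}|s_1,m_1\rangle = \hbar^{2}s_1(s_1+1)|s_1,m_1\rangle \tag{9.313}$$

$$\hat{S}_{1,z}|s_1,m_1\rangle = \hbar m_1|s_1,m_1\rangle \tag{9.314}$$

$$\vec{S}^{2}_{2,op}|s_2,m_2\rangle = \hbar^{2}s_2(s_2+1)|s_2,m_2\rangle \tag{9.315}$$

$$\hat{S}_{2,z}|s_2,m_2\rangle = \hbar m_2|s_2,m_2\rangle \tag{9.316}$$

where

$$s_1 = s_2 = \frac{1}{2} \quad \text{and} \quad m_1 = \pm\frac{1}{2} \; , \; m_2 = \pm\frac{1}{2} \tag{9.317}$$

Since both $s_1$ and $s_2$ are fixed and unchanging during this addition process we will drop them from the arguments and subscripts to lessen the complexity of the equations.

Each space (1 and 2) is 2-dimensional and thus each has two (2s + 1 = 2) basis states corresponding to the number of m-values in each case

$$\left|s_1 = \tfrac{1}{2},m_1 = \tfrac{1}{2}\right\rangle = |\uparrow\rangle_1 = |+\rangle_1 \tag{9.318}$$

$$\left|s_1 = \tfrac{1}{2},m_1 = -\tfrac{1}{2}\right\rangle = |\downarrow\rangle_1 = |-\rangle_1 \tag{9.319}$$

$$\left|s_2 = \tfrac{1}{2},m_2 = \tfrac{1}{2}\right\rangle = |\uparrow\rangle_2 = |+\rangle_2 \tag{9.320}$$

$$\left|s_2 = \tfrac{1}{2},m_2 = -\tfrac{1}{2}\right\rangle = |\downarrow\rangle_2 = |-\rangle_2 \tag{9.321}$$

This means that there are 4 possible basis states for the combined system that we can construct using the direct product procedure. We label them as

$$|\uparrow\uparrow\rangle = |++\rangle = |+\rangle_1\otimes|+\rangle_2 \quad , \quad |\uparrow\downarrow\rangle = |+-\rangle = |+\rangle_1\otimes|-\rangle_2 \tag{9.322}$$

$$|\downarrow\uparrow\rangle = |-+\rangle = |-\rangle_1\otimes|+\rangle_2 \quad , \quad |\downarrow\downarrow\rangle = |--\rangle = |-\rangle_1\otimes|-\rangle_2 \tag{9.323}$$

so that the first symbol in the combined states corresponds to system 1 and the second symbol to system 2. These are not the only possible basis states as we shall see. The "1" operators operate only on the "1" part of the direct product, for example,

$$\vec{S}^{2}_{1,op}|+-\rangle = \hbar^{2}s_1(s_1+1)|+-\rangle = \frac{3}{4}\hbar^{2}|+-\rangle \tag{9.324}$$

$$\hat{S}_{1,z}|+-\rangle = \hbar m_1|+-\rangle = \frac{\hbar}{2}|+-\rangle \tag{9.325}$$

$$\vec{S}^{2}_{2,op}|+-\rangle = \hbar^{2}s_2(s_2+1)|+-\rangle = \frac{3}{4}\hbar^{2}|+-\rangle \tag{9.326}$$

$$\hat{S}_{2,z}|+-\rangle = \hbar m_2|+-\rangle = -\frac{\hbar}{2}|+-\rangle \tag{9.327}$$

The total spin angular momentum operator for the combined system is

$$\vec{S}_{op} = \vec{S}_{1,op} + \vec{S}_{2,op} \tag{9.328}$$

It obeys the same commutation rules as the individual system operators, i.e.,

$$[\hat{S}_i,\hat{S}_j] = i\hbar\varepsilon_{ijk}\hat{S}_k \tag{9.329}$$

This tells us to look for the same kind of angular momentum eigenstates and eigenvalues

$$\vec{S}^{2}_{op}|s,m\rangle = \hbar^{2}s(s+1)|s,m\rangle \tag{9.330}$$

$$\hat{S}_z|s,m\rangle = \hbar m|s,m\rangle \tag{9.331}$$

To proceed, we need to derive some relations. Squaring the total spin operator we have

$$\begin{aligned}
\vec{S}^{2}_{op} &= (\vec{S}_{1,op} + \vec{S}_{2,op})^{2} = \vec{S}^{2}_{1,op} + \vec{S}^{2}_{2,op} + 2\vec{S}_{1,op}\cdot\vec{S}_{2,op} \\
&= \frac{3}{4}\hbar^{2}\hat{I} + \frac{3}{4}\hbar^{2}\hat{I} + 2\hat{S}_{1z}\hat{S}_{2z} + 2\hat{S}_{1x}\hat{S}_{2x} + 2\hat{S}_{1y}\hat{S}_{2y}
\end{aligned} \tag{9.332}$$

Now using

$$\hat{S}_{1\pm} = \hat{S}_{1x} \pm i\hat{S}_{1y} \quad \text{and} \quad \hat{S}_{2\pm} = \hat{S}_{2x} \pm i\hat{S}_{2y} \tag{9.333}$$

we have

$$2\hat{S}_{1x}\hat{S}_{2x} + 2\hat{S}_{1y}\hat{S}_{2y} = \hat{S}_{1+}\hat{S}_{2-} + \hat{S}_{1-}\hat{S}_{2+} \tag{9.334}$$

and therefore

$$\vec{S}^{2}_{op} = \frac{3}{2}\hbar^{2}\hat{I} + 2\hat{S}_{1z}\hat{S}_{2z} + \hat{S}_{1+}\hat{S}_{2-} + \hat{S}_{1-}\hat{S}_{2+} \tag{9.335}$$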

Now suppose we choose the four states $|++\rangle$, $|+-\rangle$, $|-+\rangle$, $|--\rangle$ as the orthonormal basis for the 4-dimensional vector space of the combined system. We then ask the following question.

Are these basis states also eigenstates of the spin operators for the combined system?

We have

$$\begin{aligned}
\hat{S}_z|++\rangle &= (\hat{S}_{1z} + \hat{S}_{2z})|+\rangle_1\otimes|+\rangle_2 = (\hat{S}_{1z}|+\rangle_1)\otimes|+\rangle_2 + |+\rangle_1\otimes(\hat{S}_{2z}|+\rangle_2) \\
&= \left(\frac{\hbar}{2}|+\rangle_1\right)\otimes|+\rangle_2 + |+\rangle_1\otimes\left(\frac{\hbar}{2}|+\rangle_2\right) = \left(\frac{\hbar}{2} + \frac{\hbar}{2}\right)|+\rangle_1\otimes|+\rangle_2 \\
&= \hbar|++\rangle
\end{aligned} \tag{9.336}$$

and similarly

$$\hat{S}_z|+-\rangle = 0 \tag{9.337}$$

$$\hat{S}_z|-+\rangle = 0 \tag{9.338}$$

$$\hat{S}_z|--\rangle = -\hbar|--\rangle \tag{9.339}$$

This says that the direct-product basis states are eigenvectors of (total) $\hat{S}_z$ and that the eigenvalues are m = 1, 0, -1 and 0 (a second time).

Since each value s of the total angular momentum of the combined system must have 2s + 1 associated m-values, these results tell us that the combined system will have

$$s = 1 \to m = 1,0,-1 \quad \text{and} \quad s = 0 \to m = 0 \tag{9.340}$$

which accounts for all of the four states. What about the $\vec{S}^{2}_{op}$ operator?


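We have

$$\begin{aligned}
\vec{S}^{2}_{op}|++\rangle &= \left(\frac{3}{2}\hbar^{2}\hat{I} + 2\hat{S}_{1z}\hat{S}_{2z} + \hat{S}_{1+}\hat{S}_{2-} + \hat{S}_{1-}\hat{S}_{2+}\right)|++\rangle \\
&= \left(\frac{3}{2}\hbar^{2} + 2\frac{\hbar}{2}\frac{\hbar}{2}\right)|++\rangle + \left(\hat{S}_{1+}\hat{S}_{2-} + \hat{S}_{1-}\hat{S}_{2+}\right)|++\rangle
\end{aligned} \tag{9.341}$$

Now using

$$\hat{S}_{1+}|++\rangle = 0 = \hat{S}_{2+}|++\rangle \tag{9.342}$$

we get

$$\vec{S}^{2}_{op}|++\rangle = 2\hbar^{2}|++\rangle = \hbar^{2}1(1+1)|++\rangle \tag{9.343}$$

or the state $|++\rangle$ is also an eigenstate of $\vec{S}^{2}_{op}$ with s = 1

$$|++\rangle = |s = 1,m = 1\rangle = |1,1\rangle \tag{9.344}$$

Similarly, we have

$$\vec{S}^{2}_{op}|--\rangle = 2\hbar^{2}|--\rangle = \hbar^{2}1(1+1)|--\rangle \tag{9.345}$$

or the state $|--\rangle$ is also an eigenstate of $\vec{S}^{2}_{op}$ with s = 1

$$|--\rangle = |s = 1,m = -1\rangle = |1,-1\rangle \tag{9.346}$$

In the same manner we can show that $|+-\rangle$ and $|-+\rangle$ are not eigenstates of $\vec{S}^{2}_{op}$. So the simple direct-product states are not appropriate to describe the combined system if we want to characterize it using

$$\vec{S}^{2}_{op} \quad \text{and} \quad \hat{S}_z \tag{9.347}$$

However, since the direct-product states are a complete basis set, we should be able to construct the remaining two eigenstates of $\vec{S}^{2}_{op}$ and $\hat{S}_z$ as linear combinations of the direct-product states

$$|s,m\rangle = \sum_{m_1,m_2}a_{smm_1m_2}|m_1,m_2\rangle \tag{9.348}$$

where we have left out the $s_1$ and $s_2$ dependence in the states and coefficients. In a formal manner, we can identify the so-called Clebsch-Gordan coefficients $a_{smm_1m_2}$ by using the orthonormality of the direct-product basis states. We have

$$\langle m_1',m_2'|s,m\rangle = \sum_{m_1,m_2}a_{smm_1m_2}\langle m_1',m_2'|m_1,m_2\rangle = \sum_{m_1,m_2}a_{smm_1m_2}\delta_{m_1'm_1}\delta_{m_2'm_2} = a_{smm_1'm_2'} \tag{9.349}$$

where we have used

$$\langle m_1',m_2'|m_1,m_2\rangle = \delta_{m_1'm_1}\delta_{m_2'm_2} \tag{9.350}$$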

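This does not help us actually compute the coefficients because we do not know the states $|s,m\rangle$. A procedure that works in this case, and that can be generalized, is the following.

We already found that $|++\rangle = |1,1\rangle$ and $|--\rangle = |1,-1\rangle$. Now we define the operators

$$\hat{S}_{\pm} = \hat{S}_x \pm i\hat{S}_y = (\hat{S}_{1x} + \hat{S}_{2x}) \pm i(\hat{S}_{1y} + \hat{S}_{2y}) = \hat{S}_{1\pm} + \hat{S}_{2\pm} \tag{9.351}$$

These are the raising and lowering operators for the combined system and thus they satisfy the relations

$$\hat{S}_{\pm}|s,m\rangle = \hbar\sqrt{s(s+1) - m(m\pm1)}\,|s,m\pm1\rangle \tag{9.352}$$

This gives

$$\hat{S}_{+}|1,1\rangle = 0 = \hat{S}_{-}|1,-1\rangle \tag{9.353}$$

as we expect. If, however, we apply $\hat{S}_{-}$ to the topmost state (maximum s and m values) we get

$$\begin{aligned}
\hat{S}_{-}|1,1\rangle &= \hbar\sqrt{1(1+1) - 1(1-1)}\,|1,0\rangle = \hbar\sqrt{2}\,|1,0\rangle \\
&= (\hat{S}_{1-} + \hat{S}_{2-})|+\rangle_1\otimes|+\rangle_2 = (\hat{S}_{1-}|+\rangle_1)\otimes|+\rangle_2 + |+\rangle_1\otimes(\hat{S}_{2-}|+\rangle_2) \\
&= \hbar|-+\rangle + \hbar|+-\rangle
\end{aligned} \tag{9.354}$$

or

$$|1,0\rangle = |s = 1,m = 0\rangle = \frac{1}{\sqrt{2}}|-+\rangle + \frac{1}{\sqrt{2}}|+-\rangle \tag{9.355}$$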

Note that the only terms that appear on the right hand side are those that have $m = m_1 + m_2$. We can easily see that this is a general property since

$$\hat{S}_z|s,m\rangle = m\hbar|s,m\rangle = \sum_{m_1,m_2}a_{smm_1m_2}(\hat{S}_{1z} + \hat{S}_{2z})|m_1,m_2\rangle = \hbar\sum_{m_1,m_2}a_{smm_1m_2}(m_1 + m_2)|m_1,m_2\rangle \tag{9.356}$$

The only way to satisfy this equation is for $m = m_1 + m_2$ in every term in the sum. Thus, our linear combination should really be written as a single sum of the form

$$|s,m\rangle = \sum_{\substack{m_1,m_2 \\ m_1+m_2=m}}a_{smm_1m_2}|m_1,m_2\rangle = \sum_{m_1}a_{smm_1,m-m_1}|m_1,m-m_1\rangle \tag{9.357}$$

These three states

$$|1,1\rangle = |++\rangle \to a_{1,1,\frac{1}{2},\frac{1}{2}} = 1 \tag{9.358}$$

$$|1,0\rangle = \frac{1}{\sqrt{2}}\left(|-+\rangle + |+-\rangle\right) \to a_{1,0,-\frac{1}{2},\frac{1}{2}} = a_{1,0,\frac{1}{2},-\frac{1}{2}} = \frac{1}{\sqrt{2}} \tag{9.359}$$

$$|1,-1\rangle = |--\rangle \to a_{1,-1,-\frac{1}{2},-\frac{1}{2}} = 1 \tag{9.360}$$

are called a triplet.


The state $|s = 0,m = 0\rangle = |0,0\rangle$ can then be found as follows: the state has m = 0, $\hat{S}_z|0,0\rangle = 0$, and thus each state (term) in the linear combination must have $m = m_1 + m_2 = 0$, which means we must be able to write

$$|0,0\rangle = a|+-\rangle + b|-+\rangle \tag{9.361}$$

where

$$|a|^{2} + |b|^{2} = 1 \quad \text{(state is normalized to 1)} \tag{9.362}$$

Now we must also have

$$\langle 1,0|0,0\rangle = 0 \quad \text{(since the } |s,m\rangle \text{ states are orthogonal)} \tag{9.363}$$

which implies that

$$\frac{1}{\sqrt{2}}a + \frac{1}{\sqrt{2}}b = 0 \to b = -a \tag{9.364}$$

We then have

$$2|a|^{2} = 1 \to a = \frac{1}{\sqrt{2}} = -b \tag{9.365}$$

and

$$|0,0\rangle = |s = 0,m = 0\rangle = \frac{1}{\sqrt{2}}|+-\rangle - \frac{1}{\sqrt{2}}|-+\rangle \tag{9.366}$$

This is called a singlet state. That completes the construction of the angular momentum states for the combined system of two spin-1/2 systems.
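The triplet and singlet assignments can be confirmed by building the 4 x 4 matrices of the total spin explicitly. The sketch below, in units with $\hbar = 1$, forms $\hat{S}_i = \hat{S}_{1i}\otimes\hat{I} + \hat{I}\otimes\hat{S}_{2i}$ with a hand-written Kronecker product and tests each candidate state. The basis ordering $(|++\rangle, |+-\rangle, |-+\rangle, |--\rangle)$ and the helper names are assumptions made for this example.

```python
import math


def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]


I2 = [[1, 0], [0, 1]]
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

# Total spin components: S_i = S_1i (x) I + I (x) S_2i.
TOTAL_SPIN = [matadd(kron(s, I2), kron(I2, s)) for s in (SX, SY, SZ)]
S_Z = TOTAL_SPIN[2]
S_SQUARED = [[0] * 4 for _ in range(4)]
for component in TOTAL_SPIN:
    S_SQUARED = matadd(S_SQUARED, matmul(component, component))

r = 1 / math.sqrt(2)
STATES = {
    "|1,1>": [1, 0, 0, 0],
    "|1,0>": [0, r, r, 0],
    "|1,-1>": [0, 0, 0, 1],
    "|0,0>": [0, r, -r, 0],
}


def eigenvalue_if_eigenvector(m, v, tol=1e-12):
    """Return the eigenvalue if v is an eigenvector of m, otherwise None."""
    w = apply(m, v)
    idx = max(range(len(v)), key=lambda i: abs(v[i]))
    lam = w[idx] / v[idx]
    if all(abs(w[i] - lam * v[i]) < tol for i in range(len(v))):
        return lam
    return None


for label, state in STATES.items():
    print(label,
          eigenvalue_if_eigenvector(S_SQUARED, state),  # s(s+1)
          eigenvalue_if_eigenvector(S_Z, state))        # m

# The product state |+-> is an eigenstate of S_z but not of S^2.
print(eigenvalue_if_eigenvector(S_SQUARED, [0, 1, 0, 0]))
```

The three triplet states give s(s+1) = 2 and the singlet gives 0, while the product state $|+-\rangle$ fails the eigenvector test for $\vec{S}^{2}_{op}$.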


We now generalize this procedure for the addition of any two angular momenta. 
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9.5.2. General Addition of Two Angular Momenta 

Given two angular momenta Ji j0 p and J 2 ,op, we have the operators and states 
for each separate system 

Jl,op —■* A,op’ Az; Jl± \jli m l) 

A,op “* ^2,op? J2zi J2± —■ y |j*2’ ^2) 


with 


A 2 oplii> TO i> = h 2 h{j 1 + 1) |ji,TOi), Jiz\ji,mi) = him \ji,mi) (9.367) 

Ji± |ji,mi) = + 1) - mi(mi ± 1) \ji ± 1,777-1) (9.368) 

^ 2 ,opb 2 ,m 2 ) = h 2 j 2 (j 2 + 1) |j 2 ,m 2 ),J 2 z |j 2 ,m 2 ) = hm 2 \ h,m 2 ) (9.369) 

</ 2 ± \j 2 ,m 2 ) = h\)j 2 (j 2 + 1) -m 2 (m 2 ± 1) \j 2 ± l,m 2 ) (9.370) 


Remember that there are 2 ji + 1 possible mi values and 2j 2 + 1 possible m 2 
values. 

Since all of the "1" operators commute with all of the "2" operators, we can find
a common eigenbasis for the four operators $J^2_{1,op}$, $J_{1z}$, $J^2_{2,op}$, $J_{2z}$ in terms of the
direct-product states

\[ |j_1,j_2,m_1,m_2\rangle = |j_1,m_1\rangle \otimes |j_2,m_2\rangle \tag{9.371} \]

For the combined system we define total operators as before

\[ \vec{J}_{op} = \vec{J}_{1,op} + \vec{J}_{2,op} = \text{total angular momentum} \tag{9.372} \]

\[ J_z = J_{1z} + J_{2z} \;,\quad [J^2_{op}, J_z] = 0 \tag{9.373} \]

\[ [J^2_{op}, J^2_{1,op}] = 0 \;,\quad [J^2_{op}, J^2_{2,op}] = 0 \tag{9.374} \]

\[ [J^2_{1,op}, J_z] = 0 \;,\quad [J^2_{2,op}, J_z] = 0 \tag{9.375} \]

These commutators imply that we can construct a common eigenbasis of $J^2_{1,op}$,
$J^2_{2,op}$, $J^2_{op}$, $J_z$ using the states $|j_1,j_2,j,m\rangle$ where

\[ J^2_{op}|j,m\rangle = \hbar^2 j(j+1)\,|j,m\rangle \quad\text{and}\quad J_z|j,m\rangle = \hbar m\,|j,m\rangle \tag{9.376} \]

There are $2j+1$ possible $m$ values for each allowed $j$ value. We cannot use the
operators $J_{1z}$, $J_{2z}$ to construct the eigenbasis for the combined system because
they do not commute with $J^2_{op}$.

Remember that in order for a label to appear in a ket vector it must be one of 
the eigenvalues of a set of commuting observables since only such a set shares a 
common eigenbasis. 
We now determine how to write the $|j_1,j_2,j,m\rangle$ states in terms of the
$|j_1,m_1\rangle \otimes |j_2,m_2\rangle$ basis. We have

\[ |j_1,j_2,j,m\rangle = \sum_{j_1'}\sum_{j_2'}\sum_{m_1}\sum_{m_2} |j_1',j_2',m_1,m_2\rangle\,\langle j_1',j_2',m_1,m_2\,|\,j_1,j_2,j,m\rangle \tag{9.377} \]

where

\[ \langle j_1',j_2',m_1,m_2\,|\,j_1,j_2,j,m\rangle = \text{Clebsch-Gordan coefficients} \tag{9.378} \]

This corresponds to inserting an identity operator of the form

\[ \hat{I} = \sum_{j_1'}\sum_{j_2'}\sum_{m_1}\sum_{m_2} |j_1',j_2',m_1,m_2\rangle\,\langle j_1',j_2',m_1,m_2| \tag{9.379} \]

Since

\[ \langle j_1',j_2',m_1,m_2|\,J^2_{1,op}\,|j_1,j_2,j,m\rangle = \hbar^2 j_1'(j_1'+1)\,\langle j_1',j_2',m_1,m_2\,|\,j_1,j_2,j,m\rangle = \hbar^2 j_1(j_1+1)\,\langle j_1',j_2',m_1,m_2\,|\,j_1,j_2,j,m\rangle \]

the Clebsch-Gordan (CG) coefficients must vanish unless $j_1' = j_1$ and similarly
unless $j_2' = j_2$. Thus we have

\[ |j_1,j_2,j,m\rangle = \sum_{m_1}\sum_{m_2} |j_1,j_2,m_1,m_2\rangle\,\langle j_1,j_2,m_1,m_2\,|\,j_1,j_2,j,m\rangle \tag{9.380} \]

Also, as we saw earlier, since $J_z = J_{1z} + J_{2z}$ we must have

\[ \langle j_1,j_2,m_1,m_2|\,J_z\,|j_1,j_2,j,m\rangle = \hbar(m_1+m_2)\,\langle j_1,j_2,m_1,m_2\,|\,j_1,j_2,j,m\rangle = \hbar m\,\langle j_1,j_2,m_1,m_2\,|\,j_1,j_2,j,m\rangle \]

which implies that the CG coefficients must vanish unless $m_1 + m_2 = m$. Thus
the only non-vanishing coefficients are

\[ \langle j_1,j_2,m_1,m_2\,|\,j_1,j_2,j,m=m_1+m_2\rangle \tag{9.381} \]

and we can write

\[ |j_1,j_2,j,m\rangle = \sum_{m_1}\sum_{m_2=m-m_1} |j_1,j_2,m_1,m_2\rangle\,\langle j_1,j_2,m_1,m_2\,|\,j_1,j_2,j,m\rangle \]

\[ \phantom{|j_1,j_2,j,m\rangle} = \sum_{m_1} |j_1,j_2,m_1,m_2=m-m_1\rangle\,\langle j_1,j_2,m_1,m_2=m-m_1\,|\,j_1,j_2,j,m\rangle \]

For fixed $j_1$ and $j_2$ there are $2j_1+1$ possible $m_1$ values and $2j_2+1$ possible $m_2$
values. Thus, there are $(2j_1+1)(2j_2+1)$ linearly independent states of the form

\[ |j_1,j_2,m_1,m_2\rangle = |j_1,m_1\rangle \otimes |j_2,m_2\rangle \]
and hence the vector space describing the combined system is $(2j_1+1)(2j_2+1)$-
dimensional.

This says that there must be $(2j_1+1)(2j_2+1)$ states of the form $|j_1,j_2,j,m\rangle$
also.

We notice that there is only one state with $m = m_1 + m_2 = j_1 + j_2$, namely,

\[ m_1 = j_1 \quad\text{and}\quad m_2 = j_2 \tag{9.382} \]

This state has the maximum possible $m$ value.

There are two states with $m = m_1 + m_2 = j_1 + j_2 - 1$, namely,

\[ m_1 = j_1 - 1 \,,\; m_2 = j_2 \quad\text{and}\quad m_1 = j_1 \,,\; m_2 = j_2 - 1 \tag{9.383} \]

and so on.

For example

\[ j_1 = 2 \,,\; j_2 = 1 \;\rightarrow\; (2(2)+1)(2(1)+1) = 15 \text{ states} \tag{9.384} \]

If we label these states by the $m$-values only (since $j_1$ and $j_2$ do not change) or
$|m_1,m_2\rangle$, then we have (in this example)

    m =  3  →  1 state   →  |2,1⟩
    m =  2  →  2 states  →  |2,0⟩ , |1,1⟩
    m =  1  →  3 states  →  |1,0⟩ , |0,1⟩ , |2,-1⟩
    m =  0  →  3 states  →  |0,0⟩ , |1,-1⟩ , |-1,1⟩
    m = -1  →  3 states  →  |-1,0⟩ , |0,-1⟩ , |-2,1⟩
    m = -2  →  2 states  →  |-2,0⟩ , |-1,-1⟩
    m = -3  →  1 state   →  |-2,-1⟩

for a total of 15 states. The combined system, as we shall see by construction,
has these states

    j = 3  →  m = 3,2,1,0,-1,-2,-3  →  7 states
    j = 2  →  m = 2,1,0,-1,-2       →  5 states
    j = 1  →  m = 1,0,-1            →  3 states

for a total of 15 states.

The general rules, which follow from group theory, are

1. The combined system has allowed $j$ values given by

\[ j_1 + j_2 \ge j \ge |j_1 - j_2| \quad\text{in integer steps} \tag{9.385} \]

2. The total number of states is given by the total number of $m$-values for
all the allowed $j$-values or

\[ \sum_{j=|j_1-j_2|}^{j_1+j_2} (2j+1) = (2j_1+1)(2j_2+1) \tag{9.386} \]

We write the addition of two angular momenta symbolically as

\[ j_1 \otimes j_2 = |j_1-j_2| \oplus (|j_1-j_2|+1) \oplus (|j_1-j_2|+2) \oplus \cdots \oplus (j_1+j_2-1) \oplus (j_1+j_2) \tag{9.387} \]

Examples

Our original special case of adding two spin = 1/2 systems gives

\[ j_1 = j_2 = \tfrac{1}{2} \;\rightarrow\; j = 0,1 \;\rightarrow\; \tfrac{1}{2}\otimes\tfrac{1}{2} = 0 \oplus 1 \;\rightarrow\; 4 \text{ states} \tag{9.388} \]

which is the result we found earlier.

Other cases are:

    j_1 = j_2 = 1       →  j = 0,1,2      →  1 ⊗ 1 = 0 ⊕ 1 ⊕ 2          →  9 states
    j_1 = 2, j_2 = 1    →  j = 1,2,3      →  2 ⊗ 1 = 1 ⊕ 2 ⊕ 3          →  15 states
    j_1 = 2, j_2 = 3    →  j = 1,2,3,4,5  →  2 ⊗ 3 = 1 ⊕ 2 ⊕ 3 ⊕ 4 ⊕ 5  →  35 states
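
The two general rules can be checked mechanically. The sketch below lists the allowed $j$ values for given $j_1$, $j_2$ and confirms that the multiplicities $2j+1$ add up to $(2j_1+1)(2j_2+1)$. The function names are illustrative; exact fractions are used so that half-integer spins are handled without rounding.

```python
from fractions import Fraction

def allowed_j(j1, j2):
    """Allowed total j values: |j1 - j2|, |j1 - j2| + 1, ..., j1 + j2."""
    j1, j2 = Fraction(j1), Fraction(j2)
    j = abs(j1 - j2)
    values = []
    while j <= j1 + j2:
        values.append(j)
        j += 1
    return values

def total_states(j1, j2):
    """Sum of the multiplicities (2j + 1) over all allowed j."""
    return sum(int(2 * j + 1) for j in allowed_j(j1, j2))

half = Fraction(1, 2)
for j1, j2 in [(half, half), (1, 1), (2, 1), (2, 3)]:
    product_dim = int((2 * Fraction(j1) + 1) * (2 * Fraction(j2) + 1))
    print(j1, j2, [str(j) for j in allowed_j(j1, j2)], total_states(j1, j2), product_dim)
```

For each pair the last two printed numbers should agree, reproducing the 4, 9, 15 and 35 states listed in the examples.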

9.5.3. Actual Construction of States 

Notation

1. states labeled $|7,6\rangle$ are $|j,m\rangle$ states

2. states labeled $|3,2\rangle_\otimes$ are $|m_1,m_2\rangle$ states

3. we suppress the $j_1$, $j_2$ labels everywhere

Procedure

1. choose $j_1$ and $j_2$

2. write down the direct-product $|m_1,m_2\rangle$ basis

3. determine the allowed $j$ values

4. write down the maximum $m$ state ($m = j_1 + j_2$); it is unique

5. the maximum $m$-value corresponds to the maximum $j$-value

6. use the lowering operator to generate all other $m$-states for this $j$-value;
there are $2j+1$, i.e.,

\[ J_-|j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|j,m-1\rangle \]

7. find the maximum $m$-state for the next lowest $j$-value; it is constructed
from the same basis states as in the corresponding $m$-states for higher
$j$-values; use orthonormality properties to figure out coefficients

8. repeat (6) and (7) until all $j$-values have been dealt with

More Detailed Examples (we must learn this process by doing it) 

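#1 - Individual system values

\[ j_1 = j_2 = \tfrac{1}{2} \;\rightarrow\; m_1, m_2 = \tfrac{1}{2}, -\tfrac{1}{2} \quad\text{(we already did this example)} \tag{9.389} \]

The basis states are (we use the notation $|+-\rangle$ here instead of $|1/2,-1/2\rangle_\otimes$)

    state            |++⟩   |+-⟩   |-+⟩   |--⟩
    m = m_1 + m_2      1      0      0     -1

Construction Algebra

Allowed $j$-values are $j = 1, 0$
$j = 1$ has $2j + 1 = 3$ $m$-values $= 1, 0, -1$
$j = 0$ has $2j + 1 = 1$ $m$-value $= 0$

$|1,1\rangle = |++\rangle$; the maximum or topmost $(j,m)$ state is always unique

\[ J_-|1,1\rangle = \sqrt{2}\hbar\,|1,0\rangle = (J_{1-} + J_{2-})\,|++\rangle = \hbar\,|+-\rangle + \hbar\,|-+\rangle \]

\[ |1,0\rangle = \frac{1}{\sqrt{2}}|+-\rangle + \frac{1}{\sqrt{2}}|-+\rangle \]

\[ J_-|1,0\rangle = \sqrt{2}\hbar\,|1,-1\rangle = (J_{1-} + J_{2-})\left(\frac{1}{\sqrt{2}}|+-\rangle + \frac{1}{\sqrt{2}}|-+\rangle\right) = \sqrt{2}\hbar\,|--\rangle \]

\[ |1,-1\rangle = |--\rangle \]

We must now have $|0,0\rangle = a|+-\rangle + b|-+\rangle$ (Rule 7) with

\[ |a|^2 + |b|^2 = 1 \quad\text{and}\quad \langle 1,0\,|\,0,0\rangle = \frac{1}{\sqrt{2}}a + \frac{1}{\sqrt{2}}b = 0 \]

which gives

\[ a = -b = \frac{1}{\sqrt{2}} \]

or

\[ |0,0\rangle = \frac{1}{\sqrt{2}}|+-\rangle - \frac{1}{\sqrt{2}}|-+\rangle \]

All the $j$-values are now done. We end up with the Clebsch-Gordan coefficients.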


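#2 - Individual system values

\[ j_1 = j_2 = 1 \;\rightarrow\; m_1, m_2 = 1, 0, -1 \tag{9.390} \]

The basis states are

    state            |1,1⟩⊗   |1,0⟩⊗   |1,-1⟩⊗   |0,1⟩⊗   |0,0⟩⊗
    m = m_1 + m_2      2        1         0         1        0

    state            |0,-1⟩⊗   |-1,1⟩⊗   |-1,0⟩⊗   |-1,-1⟩⊗
    m = m_1 + m_2      -1         0        -1        -2

Construction Algebra

Allowed $j$-values are $j = 2, 1, 0$
$j = 2$ has $2j + 1 = 5$ $m$-values $= 2, 1, 0, -1, -2$
$j = 1$ has $2j + 1 = 3$ $m$-values $= 1, 0, -1$
$j = 0$ has $2j + 1 = 1$ $m$-value $= 0$

$|2,2\rangle = |1,1\rangle_\otimes$; the maximum or topmost $(j,m)$ state is always unique

\[ J_-|2,2\rangle = 2\hbar\,|2,1\rangle = (J_{1-} + J_{2-})\,|1,1\rangle_\otimes = \sqrt{2}\hbar\,|1,0\rangle_\otimes + \sqrt{2}\hbar\,|0,1\rangle_\otimes \]

\[ |2,1\rangle = \frac{1}{\sqrt{2}}|1,0\rangle_\otimes + \frac{1}{\sqrt{2}}|0,1\rangle_\otimes \]

\[ J_-|2,1\rangle = \sqrt{6}\hbar\,|2,0\rangle = (J_{1-} + J_{2-})\left(\frac{1}{\sqrt{2}}|1,0\rangle_\otimes + \frac{1}{\sqrt{2}}|0,1\rangle_\otimes\right) = \hbar\,|1,-1\rangle_\otimes + 2\hbar\,|0,0\rangle_\otimes + \hbar\,|-1,1\rangle_\otimes \]

\[ |2,0\rangle = \frac{1}{\sqrt{6}}|1,-1\rangle_\otimes + \frac{2}{\sqrt{6}}|0,0\rangle_\otimes + \frac{1}{\sqrt{6}}|-1,1\rangle_\otimes \]

Continuing we have

\[ |2,-1\rangle = \frac{1}{\sqrt{2}}|-1,0\rangle_\otimes + \frac{1}{\sqrt{2}}|0,-1\rangle_\otimes \]

\[ |2,-2\rangle = |-1,-1\rangle_\otimes \]

which completes the five $j = 2$ states.

We must now have $|1,1\rangle = a|1,0\rangle_\otimes + b|0,1\rangle_\otimes$ with

\[ |a|^2 + |b|^2 = 1 \quad\text{and}\quad \langle 2,1\,|\,1,1\rangle = \frac{1}{\sqrt{2}}a + \frac{1}{\sqrt{2}}b = 0 \]

which gives

\[ a = -b = \frac{1}{\sqrt{2}} \]

or

\[ |1,1\rangle = \frac{1}{\sqrt{2}}|1,0\rangle_\otimes - \frac{1}{\sqrt{2}}|0,1\rangle_\otimes \]

We now find all of the $j = 1$ states

\[ J_-|1,1\rangle = \sqrt{2}\hbar\,|1,0\rangle = (J_{1-} + J_{2-})\left(\frac{1}{\sqrt{2}}|1,0\rangle_\otimes - \frac{1}{\sqrt{2}}|0,1\rangle_\otimes\right) = \hbar\,|1,-1\rangle_\otimes - \hbar\,|-1,1\rangle_\otimes \]

\[ |1,0\rangle = \frac{1}{\sqrt{2}}|1,-1\rangle_\otimes - \frac{1}{\sqrt{2}}|-1,1\rangle_\otimes \]

and continuing

\[ |1,-1\rangle = \frac{1}{\sqrt{2}}|0,-1\rangle_\otimes - \frac{1}{\sqrt{2}}|-1,0\rangle_\otimes \]

which completes the three $j = 1$ states.

We must now have $|0,0\rangle = a|1,-1\rangle_\otimes + b|0,0\rangle_\otimes + c|-1,1\rangle_\otimes$ with

\[ |a|^2 + |b|^2 + |c|^2 = 1 \quad\text{and}\quad \langle 2,0\,|\,0,0\rangle = \frac{1}{\sqrt{6}}a + \frac{2}{\sqrt{6}}b + \frac{1}{\sqrt{6}}c = 0 \]

\[ \text{and}\quad \langle 1,0\,|\,0,0\rangle = \frac{1}{\sqrt{2}}a - \frac{1}{\sqrt{2}}c = 0 \]

which gives

\[ a = -b = c = \frac{1}{\sqrt{3}} \]

or

\[ |0,0\rangle = \frac{1}{\sqrt{3}}|1,-1\rangle_\otimes - \frac{1}{\sqrt{3}}|0,0\rangle_\otimes + \frac{1}{\sqrt{3}}|-1,1\rangle_\otimes \]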

All the j-values are now done. 
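
The construction procedure used in these examples (topmost state, repeated lowering, then orthogonalization for the next $j$) can be sketched in a few lines. This is only an illustrative implementation under stated assumptions: $\hbar = 1$, states are stored as dictionaries mapping $(m_1, m_2)$ to real coefficients, and the seed for each new $j$ is the product state with $m_1 = j_1$, which reproduces the sign choices made in the text. The function names are hypothetical.

```python
import math
from fractions import Fraction

def lower(state, j1, j2):
    """Apply J_- = J1_- + J2_- (hbar = 1) to a state {(m1, m2): coeff}."""
    out = {}
    for (m1, m2), c in state.items():
        if m1 > -j1:
            f = math.sqrt(j1 * (j1 + 1) - m1 * (m1 - 1))
            out[(m1 - 1, m2)] = out.get((m1 - 1, m2), 0.0) + c * f
        if m2 > -j2:
            f = math.sqrt(j2 * (j2 + 1) - m2 * (m2 - 1))
            out[(m1, m2 - 1)] = out.get((m1, m2 - 1), 0.0) + c * f
    return out

def normalize(state):
    norm = math.sqrt(sum(c * c for c in state.values()))
    return {k: c / norm for k, c in state.items()}

def overlap(a, b):
    return sum(c * b.get(k, 0.0) for k, c in a.items())

def coupled_states(j1, j2):
    """Return {(j, m): {(m1, m2): coefficient}} for all allowed j and m."""
    j1, j2 = Fraction(j1), Fraction(j2)
    result = {}
    j = j1 + j2
    while j >= abs(j1 - j2):
        # Step 7: seed with a product state of m = j, then remove its
        # components along the already-built higher-j states with the same m.
        top = {(j1, j - j1): 1.0}
        jj = j1 + j2
        while jj > j:
            other = result[(jj, j)]
            ov = overlap(other, top)
            for k, c in other.items():
                top[k] = top.get(k, 0.0) - ov * c
            jj -= 1
        state = normalize(top)
        result[(j, j)] = state
        # Step 6: lower repeatedly to generate the remaining m states.
        m = j
        while m > -j:
            state = normalize(lower(state, j1, j2))
            m -= 1
            result[(j, m)] = state
        j -= 1
    return result

states = coupled_states(1, 1)
for (m1, m2), c in sorted(states[(0, 0)].items(), reverse=True):
    print(f"<{m1},{m2}|0,0> = {c:+.6f}")
```

For $j_1 = j_2 = 1$ the printed coefficients of $|0,0\rangle$ should have magnitude $1/\sqrt{3} \approx 0.577350$ with the sign pattern $+,-,+$ found above.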

9.6. Two- and Three-Dimensional Systems 

We now turn our attention to 2- and 3-dimensional systems that can be solved 
analytically. 

In the position representation, the wave function

\[ \psi(\vec{r}) = \langle \vec{r}\,|\,\psi\rangle \tag{9.391} \]

contains all the information that is knowable about the state of a physical system.
The equation that determines the wave function is the Schrodinger equation,
which we derived from the energy eigenvalue equation

\[ \hat{H}|\psi\rangle = E|\psi\rangle \tag{9.392} \]

The general form of the Schrodinger equation in three dimensions is

\[ -\frac{\hbar^2}{2m}\nabla^2\psi(\vec{r}) + V(\vec{r})\psi(\vec{r}) = E\psi(\vec{r}) \tag{9.393} \]

We will use a succession of concrete examples to elucidate the solution techniques 
and the physical principles that are involved. 


9.6.1. 2- and 3-Dimensional Infinite Wells 


2-Dimensional Infinite Square Well - Cartesian Coordinates 

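The potential energy function is

\[ V(x,y) = \begin{cases} 0 & |x| < \frac{a}{2} \text{ and } |y| < \frac{a}{2} \;\rightarrow\; \text{region I} \\ \infty & \text{otherwise} \;\rightarrow\; \text{region II} \end{cases} \tag{9.394} \]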


This is a simple extension of the 1-dimensional infinite well problem, but it 
is useful because it illustrates all the ideas we will need for more complicated 
problems. 


In region I

\[ \frac{\partial^2\psi(x,y)}{\partial x^2} + \frac{\partial^2\psi(x,y)}{\partial y^2} = -\frac{2mE}{\hbar^2}\,\psi(x,y) \tag{9.395} \]

In region II

\[ \psi(x,y) = 0 \tag{9.396} \]

since the potential is infinite over an extended region.


We solve this equation by the separation of variables (SOV) technique. We
assume

\[ \psi(x,y) = X(x)Y(y) \tag{9.397} \]

Upon substitution we get

\[ \frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} = -\frac{2mE}{\hbar^2} \tag{9.398} \]

Each term on the left-hand side of the equation is a function only of a single
variable and hence the only way to satisfy the equation for all $x$ and $y$ is to set
each function of a single variable equal to a constant. We have

\[ \frac{1}{X}\frac{d^2X}{dx^2} = -\frac{2mE_x}{\hbar^2} = \text{constant} \tag{9.399} \]

\[ \frac{1}{Y}\frac{d^2Y}{dy^2} = -\frac{2mE_y}{\hbar^2} = \text{constant} \tag{9.400} \]

\[ E = E_x + E_y \tag{9.401} \]

If we choose

\[ k_x^2 = \frac{2mE_x}{\hbar^2} \;,\quad k_y^2 = \frac{2mE_y}{\hbar^2} \tag{9.402} \]

we get the solutions

\[ X(x) = A\sin k_x x + B\cos k_x x \tag{9.403} \]

\[ Y(y) = C\sin k_y y + D\cos k_y y \tag{9.404} \]

with the boundary conditions

\[ X\left(\pm\frac{a}{2}\right) = 0 = Y\left(\pm\frac{a}{2}\right) \tag{9.405} \]

For the function $X(x)$ each boundary condition implies two solution types

\[ A = 0 \;\text{ or }\; \sin\frac{k_x a}{2} = 0 \quad\text{and}\quad B = 0 \;\text{ or }\; \cos\frac{k_x a}{2} = 0 \tag{9.406} \]

These are both summarized in the solution

\[ X(x) = \begin{cases} \sin\dfrac{n_x\pi x}{a} & n_x = \text{even} \\[2ex] \cos\dfrac{n_x\pi x}{a} & n_x = \text{odd} \end{cases} \qquad k_x = \frac{n_x\pi}{a} \,,\; n_x = \text{integer} \tag{9.407} \]

and similarly for $Y(y)$

\[ Y(y) = \begin{cases} \sin\dfrac{n_y\pi y}{a} & n_y = \text{even} \\[2ex] \cos\dfrac{n_y\pi y}{a} & n_y = \text{odd} \end{cases} \qquad k_y = \frac{n_y\pi}{a} \,,\; n_y = \text{integer} \tag{9.408} \]

The corresponding energy eigenvalues are

\[ E = (n_x^2 + n_y^2)\,\frac{\pi^2\hbar^2}{2ma^2} \;,\quad n_x, n_y = 1,2,3,4\ldots \tag{9.409} \]

The 1-dimensional result we found earlier was

\[ E_{1\,dim} = \frac{n^2\pi^2\hbar^2}{2ma^2} \;,\quad n = 1,2,3,4\ldots \tag{9.410} \]

A plot of these levels for comparison is shown in Figure 9.4 below; we choose

\[ \frac{\pi^2\hbar^2}{2ma^2} = 1 \]

Figure 9.4: Comparison of 1D and 2D Infinite Wells


The major change is not only that the level structure gets more complex, but
also that a major new feature appears, namely, degeneracy.

Several of the 2-dimensional well levels are degenerate, which means that
different sets of quantum numbers give the same energy eigenvalue.

In the energy level diagram above the $E = 5, 10, 13, 17, 20, 25, 26$ and $29$ levels
are all two-fold degenerate. This degeneracy arises from the fact that
$V(x,y) = V(-x,-y)$ and hence parity is conserved. This means that the correct
physical eigenstates should be simultaneous eigenstates of both the parity and
the energy.
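
The list of two-fold degenerate levels can be reproduced by direct enumeration. The sketch below groups the quantum-number pairs $(n_x, n_y)$ by $n_x^2 + n_y^2$, that is, by energy in units of $\pi^2\hbar^2/2ma^2$; the function name is an illustrative choice.

```python
from collections import defaultdict

def square_well_levels(e_max):
    """Group (n_x, n_y) by n_x^2 + n_y^2 <= e_max (energy in units of pi^2 hbar^2 / 2 m a^2)."""
    levels = defaultdict(list)
    n = 1
    while n * n + 1 <= e_max:   # largest quantum number that can still contribute
        n += 1
    for nx in range(1, n + 1):
        for ny in range(1, n + 1):
            e = nx * nx + ny * ny
            if e <= e_max:
                levels[e].append((nx, ny))
    return dict(sorted(levels.items()))

levels = square_well_levels(29)
twofold = [e for e, labels in levels.items() if len(labels) == 2]
single = [e for e, labels in levels.items() if len(labels) == 1]
print("two-fold degenerate:", twofold)
print("non-degenerate:     ", single)
```

The non-degenerate levels are those with $n_x = n_y$; every other level up to $E = 29$ appears twice, as $(n_x, n_y)$ and $(n_y, n_x)$.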

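The first three wave functions are

\[ \psi_{11}(x,y) = \cos\frac{\pi x}{a}\cos\frac{\pi y}{a} \;,\quad E = E_{11} \tag{9.411} \]

\[ \psi_{12}(x,y) = \cos\frac{\pi x}{a}\sin\frac{2\pi y}{a} \;,\quad E = E_{12} \tag{9.412} \]

\[ \psi_{21}(x,y) = \sin\frac{2\pi x}{a}\cos\frac{\pi y}{a} \;,\quad E = E_{21} = E_{12} \tag{9.413} \]

A simple calculation shows that we have

\[ \langle 11\,|\,12\rangle = \langle 11\,|\,21\rangle = 0 \rightarrow \text{orthogonal} \quad\text{and}\quad \langle 21\,|\,12\rangle \ne 0 \rightarrow \text{not orthogonal} \tag{9.414} \]

We can construct two new eigenfunctions from the degenerate pair that are
orthogonal using the Gram-Schmidt process. We get

\[ \psi_{12}^{+} = \psi_{12} + \psi_{21} \quad\text{and}\quad \psi_{12}^{-} = \psi_{12} - \psi_{21} \tag{9.415} \]

We then have

\[ \langle 12+\,|\,12-\rangle = 0 \rightarrow \text{orthogonal} \tag{9.416} \]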
We also note that for the parity operator $\hat{P}$ we have

\[ \hat{P}|11\rangle = |11\rangle \;,\quad \hat{P}|12+\rangle = |12+\rangle \;,\quad \hat{P}|12-\rangle = -|12-\rangle \tag{9.417} \]

so that the new eigenfunctions are simultaneous eigenstates of parity and energy.
Two of the wave functions are plotted below in Figures 9.5 and 9.6 as $|\psi|^2$.

Figure 9.6: $|\psi_{12}^{-}|^2$


Now let us turn to the infinite circular well in two dimensions. 


9.6.2. Two-Dimensional Infinite Circular Well 

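We consider the potential in two dimensions

\[ V(r) = \begin{cases} 0 & r < a \\ \infty & r > a \end{cases} \tag{9.418} \]

The Schrodinger equation in plane-polar coordinates is

\[ -\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\varphi^2}\right]\psi(r,\varphi) = E\psi(r,\varphi) \qquad r < a \tag{9.419} \]

\[ \psi(r,\varphi) = 0 \qquad r > a \tag{9.420} \]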
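We assume (SOV)

\[ \psi(r,\varphi) = R(r)\Phi(\varphi) \tag{9.421} \]

which gives

\[ -\frac{\hbar^2}{2m}\left[\frac{1}{rR}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2\Phi}\frac{d^2\Phi}{d\varphi^2}\right] = E \tag{9.422} \]

We choose a separation constant

\[ \frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = -\alpha^2 \;\rightarrow\; \Phi(\varphi) = B\sin(\alpha\varphi + \delta) \tag{9.423} \]

The requirement of single-valuedness under a $\varphi$-rotation of $2\pi$ says that

\[ \sin(\alpha\varphi + \delta) = \sin(\alpha\varphi + \delta + 2\pi\alpha) \;\rightarrow\; \alpha = \text{integer} = 0,1,2,3\ldots \]

Alternatively, we could write

\[ \Phi(\varphi) = Be^{i\alpha\varphi} \;,\quad \alpha = \text{integer} = \ldots,-3,-2,-1,0,1,2,3,\ldots \tag{9.424} \]

Substitution of this solution leaves the radial differential equation

\[ r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} + \left[\lambda^2r^2 - \alpha^2\right]R = 0 \quad\text{where}\quad \lambda^2 = \frac{2mE}{\hbar^2} \tag{9.425} \]

This is Bessel's equation. The general solution is

\[ R(r) = NJ_\alpha(\lambda r) + MY_\alpha(\lambda r) \tag{9.426} \]

Now $Y_\alpha(\lambda r)$ diverges as $r \rightarrow 0$. Therefore, in order to have a normalizable solution
in the region $r < a$ (which includes $r = 0$), we must choose $M = 0$ and thus we
have

\[ R(r) = NJ_\alpha(\lambda r) \tag{9.427} \]

and the complete solution is then

\[ \psi_{k\alpha}(r,\varphi) = R(r)\Phi(\varphi) = NJ_\alpha(\lambda r)e^{i\alpha\varphi} \tag{9.428} \]

The continuity (or boundary) condition at $r = a$ is

\[ \psi_{k\alpha}(a,\varphi) = 0 \;\rightarrow\; R(a) = 0 \;\rightarrow\; J_\alpha(\lambda a) = 0 \tag{9.429} \]

Thus, the allowed values of $\lambda$ and hence the allowed values of $E$ are given by

\[ \lambda_{n\alpha}a = z_{n\alpha} = \text{the } n^{th} \text{ zero of } J_\alpha \;\rightarrow\; E_{n\alpha} = \frac{\hbar^2}{2ma^2}\,z_{n\alpha}^2 \tag{9.430} \]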
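
To make the quantization condition concrete, the following sketch locates the first few zeros $z_{n\alpha}$ of $J_\alpha$ and prints the corresponding energies in units of $\hbar^2/2ma^2$. It is a minimal illustration using only the standard library: $J_\alpha$ for integer $\alpha$ is evaluated from the integral representation $J_\alpha(x) = \frac{1}{\pi}\int_0^\pi \cos(\alpha\tau - x\sin\tau)\,d\tau$ with the trapezoid rule, and zeros are bracketed by a coarse scan and refined by bisection. The function names and the scan step are assumptions made for this example, and the approach is intended for small integer orders only.

```python
import math

def bessel_j(order, x, panels=400):
    """J_order(x) for integer order via (1/pi) * integral_0^pi cos(order*t - x*sin t) dt."""
    h = math.pi / panels
    total = 0.5 * (math.cos(0.0) + math.cos(order * math.pi - x * math.sin(math.pi)))
    for k in range(1, panels):
        t = k * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi

def bessel_zeros(order, count, step=0.1):
    """First `count` positive zeros of J_order, found by scanning and bisection."""
    zeros = []
    x_lo = order + 0.5            # the first positive zero lies above the order
    f_lo = bessel_j(order, x_lo)
    while len(zeros) < count:
        x_hi = x_lo + step
        f_hi = bessel_j(order, x_hi)
        if f_lo * f_hi < 0:       # sign change brackets a zero
            a, b, fa = x_lo, x_hi, f_lo
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = bessel_j(order, mid)
                if fa * fm <= 0:
                    b = mid
                else:
                    a, fa = mid, fm
            zeros.append(0.5 * (a + b))
        x_lo, f_lo = x_hi, f_hi
    return zeros

for alpha in range(3):
    for n, z in enumerate(bessel_zeros(alpha, 2), start=1):
        print(f"alpha={alpha} n={n}  z={z:.6f}  E/(hbar^2/2ma^2)={z * z:.4f}")
```

Because $J_{-\alpha}$ and $J_\alpha$ share the same zeros, every level with $\alpha \neq 0$ is two-fold degenerate ($\pm\alpha$), while the $\alpha = 0$ levels are non-degenerate.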
Figure 9.7: Two Dimensional Wells


We compare the infinite square and circular wells in 2-dimensions using the 
energy level diagram in Figure 9.7 above. 

Note the rather dramatic differences in both the location and degeneracies for 
the two sets of energy levels. 

Some of the wave functions for the 2-dimensional circular well are shown in 
Figures 9.8-9.11 below. 



Figure 9.8: $|\psi_{01}|^2$

Figure 9.9: $|\psi_{11}|^2$

Figure 9.10: $|\psi_{03}|^2$

Figure 9.11: $|\psi_{23}|^2$


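The 3-dimensional infinite square well is a simple extension of the 2-dimensional
infinite square well. The result for the energies is

\[ E = (n_x^2 + n_y^2 + n_z^2)\,\frac{\pi^2\hbar^2}{2ma^2} \;,\quad n_x, n_y, n_z = 1,2,3,4\ldots \tag{9.431} \]

The 3-dimensional infinite spherical well involves the potential

\[ V(r,\theta,\varphi) = \begin{cases} 0 & r < a \quad \text{region I} \\ \infty & r > a \quad \text{region II} \end{cases} \tag{9.432} \]

The Schrodinger equation is:

Region I

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} = -\frac{2mE}{\hbar^2}\,\psi \tag{9.433} \]

Region II

\[ \psi(r,\theta,\varphi) = 0 \tag{9.434} \]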
The equation in region I can be rewritten in terms of the $L^2_{op}$ operator as

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) - \frac{L^2_{op}}{\hbar^2r^2}\,\psi = -\frac{2m}{\hbar^2}E\psi \tag{9.435} \]

Since the potential energy is spherically symmetric, we have

\[ [L^2_{op}, H] = 0 \tag{9.436} \]

and thus the operators $L^2_{op}$ and $H$ have a common eigenbasis.

Earlier, we found the eigenfunctions of $L^2_{op}$ to be the spherical harmonics $Y_{\ell m}(\theta,\varphi)$,
where

\[ L^2_{op}Y_{\ell m}(\theta,\varphi) = \hbar^2\ell(\ell+1)\,Y_{\ell m}(\theta,\varphi) \tag{9.437} \]

Therefore, we can write (SOV)

\[ \psi(r,\theta,\varphi) = R_\ell(r)Y_{\ell m}(\theta,\varphi) \tag{9.438} \]

Substitution of this form of the solution gives the radial equation in region I

\[ \frac{d^2R_\ell}{dr^2} + \frac{2}{r}\frac{dR_\ell}{dr} - \frac{\ell(\ell+1)}{r^2}R_\ell + k^2R_\ell = 0 \tag{9.439} \]

where

\[ E = \frac{\hbar^2k^2}{2m} \tag{9.440} \]

In region II we have $R_\ell(r) = 0$.

The most general solution of the radial equation (which is a different form of
Bessel's equation) is

\[ R_\ell(r) = Aj_\ell(kr) + B\eta_\ell(kr) \tag{9.441} \]

where

\[ j_\ell(kr) = \left(\frac{\pi}{2kr}\right)^{1/2}J_{\ell+1/2}(kr) \tag{9.442} \]

\[ \eta_\ell(kr) = (-1)^{\ell+1}\left(\frac{\pi}{2kr}\right)^{1/2}J_{-\ell-1/2}(kr) \tag{9.443} \]

are the spherical Bessel functions.

Now $\eta_\ell(kr)$ diverges as $r \rightarrow 0$. Therefore, the normalizable solution in region I
(which contains $r = 0$) is

\[ R_\ell(r) = Aj_\ell(kr) \tag{9.444} \]

The first few of these functions are

\[ j_0(x) = \frac{\sin x}{x} \tag{9.445} \]

\[ j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x} \tag{9.446} \]

\[ j_2(x) = \left(\frac{3}{x^3} - \frac{1}{x}\right)\sin x - \frac{3}{x^2}\cos x \tag{9.447} \]
The boundary conditions give

\[ j_\ell(k_{n\ell}a) = 0 \quad\text{where}\quad k_{n\ell}a = \text{the } n^{th} \text{ zero of } j_\ell \tag{9.448} \]

The full solution is

\[ \psi_{n\ell m}(r,\theta,\varphi) = R_{n\ell}(r)Y_{\ell m}(\theta,\varphi) = j_\ell(k_{n\ell}r)\,P_\ell^m(\cos\theta)\,e^{im\varphi} \tag{9.449} \]

Some of the wavefunctions (absolute squares) are shown in Figures 9.12-9.16
below.

Figure 9.13: $n\ell m = 200$

Figure 9.14: $n\ell m = 111$

Figure 9.15: $n\ell m = 211$


9.6.3. 3-Dimensional Finite Well 

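We now consider the potential function in three dimensions

\[ V(r) = \begin{cases} -V_0 & r < a \quad \text{region I} \\ 0 & r > a \quad \text{region II} \end{cases} \tag{9.450} \]

The Schrodinger equation is

Region I (same as infinite well)

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2m}{\hbar^2}V_0\psi = -\frac{2m}{\hbar^2}E\psi \tag{9.451} \]

Region II

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} = -\frac{2m}{\hbar^2}E\psi \tag{9.452} \]

If we choose $E = -|E| < 0$ for bound states, we have

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) - \frac{L^2_{op}}{\hbar^2r^2}\psi + \frac{2m}{\hbar^2}V_0\psi = \frac{2m}{\hbar^2}|E|\,\psi \quad\text{in region I} \tag{9.453} \]

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) - \frac{L^2_{op}}{\hbar^2r^2}\psi = \frac{2m}{\hbar^2}|E|\,\psi \quad\text{in region II} \tag{9.454} \]

We then write, as before,

\[ \psi(r,\theta,\varphi) = R_\ell(r)Y_{\ell m}(\theta,\varphi) \tag{9.455} \]

and get

\[ \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{\ell(\ell+1)}{r^2}R + \frac{2m}{\hbar^2}V_0R = \frac{2m}{\hbar^2}|E|\,R \quad\text{in region I} \tag{9.456} \]

\[ \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{\ell(\ell+1)}{r^2}R = \frac{2m}{\hbar^2}|E|\,R \quad\text{in region II} \tag{9.457} \]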
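which finally becomes

\[ \frac{d^2R}{d\rho^2} + \frac{2}{\rho}\frac{dR}{d\rho} + \left[1 - \frac{\ell(\ell+1)}{\rho^2}\right]R = 0 \tag{9.458} \]

\[ \rho = \alpha r \;,\quad \alpha^2 = \frac{2m}{\hbar^2}\left(V_0 - |E|\right) \quad\text{in region I} \tag{9.459} \]

and

\[ \frac{d^2R}{d\gamma^2} + \frac{2}{\gamma}\frac{dR}{d\gamma} + \left[1 - \frac{\ell(\ell+1)}{\gamma^2}\right]R = 0 \tag{9.460} \]

\[ \gamma = i\beta r \;,\quad \beta^2 = \frac{2m}{\hbar^2}|E| \quad\text{in region II} \tag{9.461} \]

The solutions are

\[ R(r) = Aj_\ell(\alpha r) \qquad r < a \tag{9.462} \]

\[ R(r) = Bh_\ell^{(1)}(i\beta r) = B\left[j_\ell(i\beta r) + i\eta_\ell(i\beta r)\right] \qquad r > a \tag{9.463} \]

where

\[ j_0(x) = \frac{\sin x}{x} \tag{9.464} \]

\[ j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x} \tag{9.465} \]

\[ j_2(x) = \left(\frac{3}{x^3} - \frac{1}{x}\right)\sin x - \frac{3}{x^2}\cos x \tag{9.466} \]

\[ j_\ell(x) = x^\ell\left(-\frac{1}{x}\frac{d}{dx}\right)^\ell\frac{\sin x}{x} \tag{9.467} \]

which are spherical Bessel functions of the 1st kind, and

\[ \eta_0(x) = -\frac{\cos x}{x} \tag{9.468} \]

\[ \eta_1(x) = -\frac{\cos x}{x^2} - \frac{\sin x}{x} \tag{9.469} \]

\[ \eta_2(x) = -\left(\frac{3}{x^3} - \frac{1}{x}\right)\cos x - \frac{3}{x^2}\sin x \tag{9.470} \]

\[ \eta_\ell(x) = -x^\ell\left(-\frac{1}{x}\frac{d}{dx}\right)^\ell\frac{\cos x}{x} \tag{9.471} \]

which are spherical Bessel functions of the 2nd kind, and

\[ h_0^{(1)}(ix) = -\frac{1}{x}e^{-x} \tag{9.472} \]

\[ h_1^{(1)}(ix) = i\left(\frac{1}{x} + \frac{1}{x^2}\right)e^{-x} \tag{9.473} \]

\[ h_2^{(1)}(ix) = \left(\frac{1}{x} + \frac{3}{x^2} + \frac{3}{x^3}\right)e^{-x} \tag{9.474} \]

\[ h_\ell^{(1)}(ix) = j_\ell(ix) + i\eta_\ell(ix) \tag{9.475} \]

which are spherical Hankel functions of the 1st kind.

There is another independent solution for $r > a$, namely,

\[ h_\ell^{(2)}(ix) = j_\ell(ix) - i\eta_\ell(ix) \tag{9.476} \]

which is a spherical Hankel function of the 2nd kind, but we must exclude it
because it behaves like $e^{\beta r}$ as $r \rightarrow \infty$ and, hence, is not normalizable.

We have also excluded $\eta_\ell(\alpha r)$ from the solution for $r < a$ because it diverges at
$r = 0$.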


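We note for future reference that we have the asymptotic behaviors

\[ j_\ell(x) \underset{x\to 0}{\longrightarrow} \frac{x^\ell}{1\cdot 3\cdot 5\cdots(2\ell+1)} \quad\text{and}\quad \eta_\ell(x) \underset{x\to 0}{\longrightarrow} -\frac{1\cdot 3\cdot 5\cdots(2\ell-1)}{x^{\ell+1}} \tag{9.477} \]

\[ j_\ell(x) \underset{x\to\infty}{\longrightarrow} \frac{1}{x}\cos\left[x - \frac{(\ell+1)\pi}{2}\right] \quad\text{and}\quad \eta_\ell(x) \underset{x\to\infty}{\longrightarrow} \frac{1}{x}\sin\left[x - \frac{(\ell+1)\pi}{2}\right] \tag{9.478} \]

\[ h_\ell^{(1)}(x) \underset{x\to\infty}{\longrightarrow} \frac{1}{x}e^{i\left[x - \frac{(\ell+1)\pi}{2}\right]} \quad\text{and}\quad h_\ell^{(2)}(x) \underset{x\to\infty}{\longrightarrow} \frac{1}{x}e^{-i\left[x - \frac{(\ell+1)\pi}{2}\right]} \tag{9.479} \]

Since both $R$ and $dR/dr$ are continuous at $r = a$, we can combine the two
continuity equations into one using the continuity of the so-called logarithmic
derivative

\[ \frac{1}{R}\frac{dR}{dr} \quad\text{at } r = a \tag{9.480} \]

For each value of $\ell$ this gives a transcendental equation for the energy $E$.

Examples:

\[ \ell = 0: \quad \xi\cot\xi = -\zeta \quad\text{and}\quad \xi^2 + \zeta^2 = \frac{2mV_0a^2}{\hbar^2} \;,\quad \xi = \alpha a \text{ and } \zeta = \beta a \tag{9.481} \]

\[ \ell = 1: \quad \frac{\cot\xi}{\xi} - \frac{1}{\xi^2} = \frac{1}{\zeta} + \frac{1}{\zeta^2} \quad\text{and}\quad \xi^2 + \zeta^2 = \frac{2mV_0a^2}{\hbar^2} \;,\quad \xi = \alpha a \text{ and } \zeta = \beta a \tag{9.482} \]

A graphical solution for the $\ell = 0$ case is shown in Figure 9.16 below.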
Figure 9.16: Graphical Solution

For a given well, only one circle exists on the plot. The solutions (energy
eigenvalues) are given by the intersection of that circle with the cotangent curves.

The big change from the finite well in one dimension is that the quantity

\[ \frac{2mV_0a^2}{\hbar^2} \tag{9.483} \]

must be larger than some minimum value before any bound state exists (this
corresponds to the radius of the smallest circle that intersects the cotangent
curves). In particular,

\[ \frac{2mV_0a^2}{\hbar^2} < \left(\frac{\pi}{2}\right)^2 \;\rightarrow\; \text{no solution} \tag{9.484} \]

\[ \left(\frac{\pi}{2}\right)^2 < \frac{2mV_0a^2}{\hbar^2} < \left(\frac{3\pi}{2}\right)^2 \;\rightarrow\; 1 \text{ solution} \tag{9.485} \]

\[ \left(\frac{3\pi}{2}\right)^2 < \frac{2mV_0a^2}{\hbar^2} < \left(\frac{5\pi}{2}\right)^2 \;\rightarrow\; 2 \text{ solutions} \tag{9.486} \]
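
The graphical solution for $\ell = 0$ can also be carried out numerically. The sketch below solves $\xi\cot\xi = -\zeta$ together with $\xi^2 + \zeta^2 = 2mV_0a^2/\hbar^2$ by bisection and illustrates the threshold behavior for the number of bound states. The function name and the dimensionless argument `strength` (standing for $2mV_0a^2/\hbar^2$) are assumptions introduced for this example.

```python
import math

def s_wave_bound_states(strength):
    """Solve xi*cot(xi) = -zeta, xi^2 + zeta^2 = strength; return the list of xi roots.

    `strength` is the dimensionless well parameter 2 m V0 a^2 / hbar^2.
    Each root gives a bound state with |E| = hbar^2 zeta^2 / (2 m a^2).
    """
    radius = math.sqrt(strength)

    def f(xi):
        return xi * math.cos(xi) / math.sin(xi) + math.sqrt(max(strength - xi * xi, 0.0))

    roots = []
    k = 1
    # One root lies in each interval ((2k-1)*pi/2, k*pi) that starts inside the circle.
    while (2 * k - 1) * math.pi / 2 < radius:
        lo = (2 * k - 1) * math.pi / 2      # f(lo) > 0 here
        hi = min(k * math.pi, radius)       # f < 0 just below hi
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
        k += 1
    return roots

for strength in (1.0, 4.0, 25.0, 64.0):
    xis = s_wave_bound_states(strength)
    zetas = [math.sqrt(strength - xi * xi) for xi in xis]
    print(f"2mV0a^2/hbar^2 = {strength:5.1f}: {len(xis)} bound state(s), zeta = {[round(z, 4) for z in zetas]}")
```

With these sample values the number of bound states goes 0, 1, 2, 3 as the radius $\sqrt{2mV_0a^2/\hbar^2}$ crosses $\pi/2$, $3\pi/2$ and $5\pi/2$.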
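9.6.4. Two-Dimensional Harmonic Oscillator

The 2-dimensional harmonic oscillator system has the Hamiltonian

\[ H = \frac{1}{2m}\left(p_x^2 + p_y^2\right) + \frac{1}{2}m\omega^2\left(x^2 + y^2\right) = \left(\frac{p_x^2}{2m} + \frac{1}{2}m\omega^2x^2\right) + \left(\frac{p_y^2}{2m} + \frac{1}{2}m\omega^2y^2\right) = H_x + H_y \tag{9.487} \]

where we have the commutation relations

\[ [H_x, H_y] = 0 = [H, H_x] = [H, H_y] \tag{9.488} \]

These commutators imply that $H$, $H_x$ and $H_y$ have a common eigenbasis. We
label their common state vectors by

\[ |E\rangle = |E_x, E_y\rangle = |E_x\rangle|E_y\rangle \tag{9.489} \]

\[ H_x|E_x\rangle = E_x|E_x\rangle \tag{9.490} \]

\[ H_y|E_y\rangle = E_y|E_y\rangle \tag{9.491} \]

\[ H|E\rangle = E|E\rangle = (H_x + H_y)|E_x\rangle|E_y\rangle = (E_x + E_y)|E\rangle \tag{9.492} \]

\[ E = E_x + E_y \tag{9.493} \]

Now $H_x$ (and $H_y$) each represent a 1-dimensional oscillator. This suggests that
we define new operators for the $x$ coordinate

\[ a_x = \sqrt{\frac{m\omega}{2\hbar}}\,x + \frac{i}{\sqrt{2m\hbar\omega}}\,p_x \;,\quad a_x^+ = \sqrt{\frac{m\omega}{2\hbar}}\,x - \frac{i}{\sqrt{2m\hbar\omega}}\,p_x \tag{9.494} \]

where

\[ [x, p_x] = i\hbar \;\rightarrow\; [a_x, a_x^+] = 1 \tag{9.495} \]

and

\[ H_x = \hbar\omega\left(a_x^+a_x + \frac{1}{2}\right) = \hbar\omega\left(N_x + \frac{1}{2}\right) \tag{9.496} \]

As we found earlier, $N_x$ has an eigenvalue equation

\[ N_x|n_x\rangle = n_x|n_x\rangle \;,\quad n_x = 0,1,2,3,\ldots \tag{9.497} \]

We then have

\[ H_x|n_x\rangle = \hbar\omega\left(N_x + \frac{1}{2}\right)|n_x\rangle = \hbar\omega\left(n_x + \frac{1}{2}\right)|n_x\rangle \tag{9.498} \]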
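or

\[ E_x = \hbar\omega\left(n_x + \frac{1}{2}\right) \tag{9.499} \]

In a similar manner, we repeat this process for the $y$ coordinate

\[ a_y = \sqrt{\frac{m\omega}{2\hbar}}\,y + \frac{i}{\sqrt{2m\hbar\omega}}\,p_y \;,\quad a_y^+ = \sqrt{\frac{m\omega}{2\hbar}}\,y - \frac{i}{\sqrt{2m\hbar\omega}}\,p_y \tag{9.500} \]

where

\[ [y, p_y] = i\hbar \;\rightarrow\; [a_y, a_y^+] = 1 \tag{9.501} \]

and

\[ H_y = \hbar\omega\left(a_y^+a_y + \frac{1}{2}\right) = \hbar\omega\left(N_y + \frac{1}{2}\right) \tag{9.502} \]

As we found earlier, $N_y$ has an eigenvalue equation

\[ N_y|n_y\rangle = n_y|n_y\rangle \;,\quad n_y = 0,1,2,3,\ldots \tag{9.503} \]

We then have

\[ H_y|n_y\rangle = \hbar\omega\left(N_y + \frac{1}{2}\right)|n_y\rangle = \hbar\omega\left(n_y + \frac{1}{2}\right)|n_y\rangle \tag{9.504} \]

or

\[ E_y = \hbar\omega\left(n_y + \frac{1}{2}\right) \tag{9.505} \]

Putting this all together we get

\[ E = E_x + E_y = \hbar\omega(n_x + n_y + 1) = \hbar\omega(n + 1) \tag{9.506} \]

Table 9.1 below gives the resulting energy level structure.

    n_x   n_y   E/ħω   n
     0     0     1     0
     1     0     2     1
     0     1     2     1
     0     2     3     2
     2     0     3     2
     1     1     3     2

Table 9.1: Energy Levels - 2D Oscillator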

Each energy value, which is characterized by the quantum number n, has a 
degeneracy equal to (n+ 1). 

The existence of degeneracy indicates (this is a general rule) that there is
another operator that commutes with $H$.

Since this is a central force in the $x$-$y$ plane, it is not difficult to guess that the
other operator that commutes with $H$ is the angular momentum about the axis
perpendicular to the plane (the $z$-axis), $L_z$. Since all of the $x$-type operators
commute with all of the $y$-type operators we can write

\[ L_z = (\vec{r}_{op}\times\vec{p}_{op})_z = xp_y - yp_x \tag{9.507} \]

Inverting the standard operator definitions we have

\[ x = \sqrt{\frac{\hbar}{2m\omega}}\left(a_x + a_x^+\right) \;,\quad p_x = \frac{1}{i}\sqrt{\frac{m\hbar\omega}{2}}\left(a_x - a_x^+\right) \tag{9.508} \]

\[ y = \sqrt{\frac{\hbar}{2m\omega}}\left(a_y + a_y^+\right) \;,\quad p_y = \frac{1}{i}\sqrt{\frac{m\hbar\omega}{2}}\left(a_y - a_y^+\right) \tag{9.509} \]

which gives

\[ L_z = \frac{\hbar}{i}\left(a_x^+a_y - a_y^+a_x\right) \tag{9.510} \]

Now using

\[ [a_x, N_x] = a_x \;,\quad [a_x^+, N_x] = -a_x^+ \tag{9.511} \]

\[ [a_y, N_y] = a_y \;,\quad [a_y^+, N_y] = -a_y^+ \tag{9.512} \]

we get

\[ [H_x, L_z] = \frac{\hbar^2\omega}{i}\left(a_x^+a_y + a_y^+a_x\right) = -[H_y, L_z] \tag{9.513} \]

or

\[ [H, L_z] = 0 \tag{9.514} \]

Therefore, $H$ and $L_z$ share a common eigenbasis. This new eigenbasis will not
be an eigenbasis for $H_x$ or $H_y$ separately since they do not commute with $L_z$.

This suggests that we use linear combinations of the degenerate eigenstates to
find the eigenstates of $L_z$. This works because linear combinations for fixed $n$
remain eigenstates of $H$ (they are no longer eigenstates of $H_x$ or $H_y$, however).

We define the eigenstates and eigenvalues of $L_z$ by the equation

\[ L_z|m\rangle = m\hbar\,|m\rangle \tag{9.515} \]

and the common eigenstates of $H$ and $L_z$ by $|n,m\rangle$ where

\[ H|n,m\rangle = \hbar\omega(n+1)\,|n,m\rangle \tag{9.516} \]

\[ L_z|n,m\rangle = m\hbar\,|n,m\rangle \tag{9.517} \]

For notational clarity we will write the old states as $|n_x\rangle|n_y\rangle$.
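For the $n = 0$ states (there is only one) we have

\[ L_z|0,0\rangle = L_z|0\rangle|0\rangle = 0 \;\rightarrow\; m = 0 \,,\; n = 0 \tag{9.518} \]

Now we look at the $n = 1$ states. We let

\[ |\psi\rangle = a|0\rangle|1\rangle + b|1\rangle|0\rangle \quad\text{where}\quad |a|^2 + |b|^2 = 1 \text{ (normalization)} \tag{9.519} \]

Since the two states that make up the linear combination are both eigenstates
of $H$ with $n = 1$, the linear combination is an eigenstate of $H$ with $n = 1$ for any
choice of $a$ and $b$. We therefore choose $a$ and $b$ to make this state an eigenstate
of $L_z$.

We must have

\[ L_z|\psi\rangle = L_z|1,m\rangle = L_z\left(a|0\rangle|1\rangle + b|1\rangle|0\rangle\right) = \frac{\hbar}{i}\left(a_x^+a_y - a_y^+a_x\right)\left(a|0\rangle|1\rangle + b|1\rangle|0\rangle\right) = m\hbar\left(a|0\rangle|1\rangle + b|1\rangle|0\rangle\right) \tag{9.520} \]

Using

\[ a|n\rangle = \sqrt{n}\,|n-1\rangle \;,\quad a^+|n\rangle = \sqrt{n+1}\,|n+1\rangle \tag{9.521} \]

we get

\[ a\frac{\hbar}{i}|1\rangle|0\rangle - b\frac{\hbar}{i}|0\rangle|1\rangle = m\hbar\left(a|0\rangle|1\rangle + b|1\rangle|0\rangle\right) \tag{9.522} \]

\[ ma = ib \quad\text{and}\quad mb = -ia \tag{9.523} \]

Dividing these two equations we get

\[ \frac{a}{b} = -\frac{b}{a} \tag{9.524} \]

which implies that

\[ a^2 = -b^2 \;\rightarrow\; a = \pm ib \tag{9.525} \]

and

\[ m = \frac{ib}{a} = \begin{cases} +1 & a = +ib \\ -1 & a = -ib \end{cases} \tag{9.526} \]

Normalization then says that

\[ m = \begin{cases} +1 & a = \frac{1}{\sqrt{2}} \,,\; b = -\frac{i}{\sqrt{2}} \\[1ex] -1 & a = \frac{1}{\sqrt{2}} \,,\; b = +\frac{i}{\sqrt{2}} \end{cases} \tag{9.527} \]

\[ |1,\pm 1\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle|1\rangle \mp i\,|1\rangle|0\rangle\right) \tag{9.528} \]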
    n    m    E/ħω
    0    0     1
    1   +1     2
    1   -1     2

Table 9.2: Energy Levels - (n,m) Characterization

This gives a new characterization of the two lowest energy levels as shown
in Table 9.2 above.


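Let us now do the $n = 2$ states. In the same way we assume

\[ |\psi\rangle = a|2\rangle|0\rangle + b|1\rangle|1\rangle + c|0\rangle|2\rangle \tag{9.529} \]

where normalization gives

\[ |a|^2 + |b|^2 + |c|^2 = 1 \tag{9.530} \]

We then have

\[ L_z|\psi\rangle = L_z|2,m\rangle = m\hbar\,|\psi\rangle = m\hbar a|2\rangle|0\rangle + m\hbar b|1\rangle|1\rangle + m\hbar c|0\rangle|2\rangle \]

\[ = \frac{\hbar}{i}\left(a_x^+a_y - a_y^+a_x\right)\left(a|2\rangle|0\rangle + b|1\rangle|1\rangle + c|0\rangle|2\rangle\right) = \frac{\hbar}{i}\left[\sqrt{2}\,b\,|2\rangle|0\rangle + \sqrt{2}\,(c-a)\,|1\rangle|1\rangle - \sqrt{2}\,b\,|0\rangle|2\rangle\right] \tag{9.531} \]

which gives

\[ ma = -i\sqrt{2}\,b \tag{9.532} \]

\[ mc = +i\sqrt{2}\,b \tag{9.533} \]

\[ mb = -i\sqrt{2}\,(c-a) \tag{9.534} \]

For $m \neq 0$, dividing the first two equations gives

\[ \frac{a}{c} = -1 \;\rightarrow\; c = -a \]

and dividing the third by the first gives

\[ \frac{b}{a} = \frac{c-a}{b} = -\frac{2a}{b} \;\rightarrow\; b^2 = -2a^2 \;\rightarrow\; b = \pm i\sqrt{2}\,a \]

Putting these pieces all together we have

\[ a = \frac{1}{2} = -c \;,\quad b = \pm\frac{i}{\sqrt{2}} \tag{9.535} \]

This implies the $m$-values

\[ m = \begin{cases} +2 & b = +i\sqrt{2}\,a \\ -2 & b = -i\sqrt{2}\,a \end{cases} \tag{9.536} \]

The remaining solution has $m = 0$, which requires $b = 0$ and $c = a$, or

\[ |2,0\rangle = \frac{1}{\sqrt{2}}|2\rangle|0\rangle + \frac{1}{\sqrt{2}}|0\rangle|2\rangle \tag{9.537} \]

Thus, the final energy levels are as shown in Table 9.3 below.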


    n    m    E/ħω
    0    0     1
    1   +1     2
    1   -1     2
    2   +2     3
    2    0     3
    2   -2     3

Table 9.3: Energy Levels - (n,m) Characterization

What a strange result! The allowed $m$-values are separated by $\Delta m = \pm 2$.
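
The eigenvector algebra above can be verified directly. The following sketch builds the matrix of $L_z/\hbar = -i(a_x^+a_y - a_y^+a_x)$ in the degenerate subspace of fixed $n$, using the basis $|n_x\rangle|n-n_x\rangle$ indexed by $n_x$, and checks that the states found in the text are eigenvectors with the stated $m$. The function names and the basis ordering are choices made for this illustration.

```python
import math

def lz_matrix(n):
    """Matrix of L_z / hbar in the basis |n_x>|n - n_x>, indexed by n_x = 0..n."""
    dim = n + 1
    mat = [[0j] * dim for _ in range(dim)]
    for nx in range(dim):
        ny = n - nx
        if ny > 0:   # -i * a_x^+ a_y takes |nx, ny> to |nx + 1, ny - 1>
            mat[nx + 1][nx] += -1j * math.sqrt((nx + 1) * ny)
        if nx > 0:   # +i * a_y^+ a_x takes |nx, ny> to |nx - 1, ny + 1>
            mat[nx - 1][nx] += 1j * math.sqrt(nx * (ny + 1))
    return mat

def apply(mat, vec):
    return [sum(mat[r][c] * vec[c] for c in range(len(vec))) for r in range(len(vec))]

def is_eigenvector(mat, vec, m, tol=1e-12):
    """True if mat @ vec equals m * vec within tol."""
    return all(abs(w - m * v) < tol for w, v in zip(apply(mat, vec), vec))

r = 1 / math.sqrt(2)
# n = 1, vector entries ordered as (coefficient of |0>|1>, coefficient of |1>|0>)
n1_states = {+1: [r, -1j * r], -1: [r, 1j * r]}
# n = 2, ordered as (coefficient of |0>|2>, of |1>|1>, of |2>|0>) = (c, b, a)
n2_states = {+2: [-0.5, 1j * r, 0.5], 0: [r, 0.0, r], -2: [-0.5, -1j * r, 0.5]}

for m, vec in n1_states.items():
    print("n=1, m=%+d:" % m, is_eigenvector(lz_matrix(1), vec, m))
for m, vec in n2_states.items():
    print("n=2, m=%+d:" % m, is_eigenvector(lz_matrix(2), vec, m))
```

Every line should print `True`, confirming that within each level the $m$-values step by 2.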


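Let us look at this system using the Schrodinger equation to help us understand
what is happening.

We have (using plane-polar coordinates)

\[ -\frac{\hbar^2}{2M}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\varphi^2}\right] + \frac{1}{2}M\omega^2r^2\psi = E\psi \tag{9.538} \]

Choosing

\[ \psi(r,\varphi) = R(r)\Phi(\varphi) \tag{9.539} \]

we get

\[ -\frac{\hbar^2}{2M}\frac{1}{R}\left(\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr}\right) - \frac{\hbar^2}{2Mr^2}\frac{1}{\Phi}\frac{\partial^2\Phi}{\partial\varphi^2} + \frac{1}{2}M\omega^2r^2 = E \tag{9.540} \]

Now we must have

\[ \frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = -m^2 = \text{constant} \tag{9.541} \]

which produces a radial equation of the form

\[ -\frac{\hbar^2}{2M}\left(\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr}\right) + \left[\frac{\hbar^2m^2}{2Mr^2} + \frac{1}{2}M\omega^2r^2\right]R = ER \tag{9.542} \]

Now we change the variables using

\[ r = \rho y \;,\quad \rho = \sqrt{\frac{\hbar}{M\omega}} \tag{9.543} \]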
and get

\[ \frac{d^2R}{dy^2} + \frac{1}{y}\frac{dR}{dy} + \left(\varepsilon - y^2 - \frac{m^2}{y^2}\right)R = 0 \tag{9.544} \]

where

\[ \varepsilon = \frac{2E}{\hbar\omega} \tag{9.545} \]

As with the solution of other differential equations, the procedure to follow is
to extract out the asymptotic behavior as $y \rightarrow 0, \infty$ and solve the equation for
the remaining function by recognizing the well-known equation that results.

As $y \rightarrow \infty$ the dominant term will be $y^2$ and the equation for this behavior is

\[ \frac{d^2R}{dy^2} - y^2R = 0 \tag{9.546} \]

which has a solution $R \rightarrow e^{-y^2/2}$.

As $y \rightarrow 0$ the dominant term will be $1/y^2$ and the equation for this behavior is

\[ \frac{d^2R}{dy^2} + \frac{1}{y}\frac{dR}{dy} - \frac{m^2}{y^2}R = 0 \tag{9.547} \]

which has a solution $R \rightarrow y^{|m|}$, where we have excluded any negative powers
since that solution would diverge at $y = 0$.

Therefore we assume $R = y^{|m|}e^{-y^2/2}G(y)$. Substitution gives the equation for $G$
as

\[ \frac{d^2G}{dy^2} + \left(\frac{2|m|+1}{y} - 2y\right)\frac{dG}{dy} + \left(\varepsilon - 2 - 2|m|\right)G = 0 \tag{9.548} \]

Changing the variable again to $z = y^2$ we have

\[ \frac{d^2G}{dz^2} + \left(\frac{|m|+1}{z} - 1\right)\frac{dG}{dz} + \left(\frac{\varepsilon - 2(|m|+1)}{4z}\right)G = 0 \tag{9.549} \]

If we are clever, we recognize this as Laguerre's equation. If not, we make a
series solution substitution

\[ G(z) = \sum_{s=0}^{\infty} b_sz^s \tag{9.550} \]

which gives the recursion relation

\[ \frac{b_{s+1}}{b_s} \rightarrow \frac{1}{s} \quad\text{for large } s \tag{9.551} \]

This says that unless the series terminates (becomes a polynomial in $z$) it will
behave like $e^z = e^{y^2}$, which implies that the solution for $R(y)$ will diverge for
large $y$ and thus not be normalizable.


If we choose the maximum $s$-value to be $s_{max} = n_r$, then we can terminate the
series by choosing

\[ \varepsilon = 2|m| + 2 + 4n_r \tag{9.552} \]

which then gives us the allowed energy eigenvalues

\[ E_{n_r,m} = \hbar\omega\left(|m| + 2n_r + 1\right) \tag{9.553} \]

The polynomial solutions are the generalized Laguerre polynomials $L_{n_r}^{|m|}(z)$.

The first few Laguerre polynomials are

\[ L_0^k(z) = 1 \;,\quad L_1^k(z) = 1 + k - z \tag{9.554} \]

\[ L_2^k(z) = \frac{1}{2}\left(2 + 3k + k^2 - 2z(k+2) + z^2\right) \tag{9.555} \]

The full wave function is

\[ \psi_{n_r,m}(r,\varphi) = r^{|m|}\,e^{-r^2/2\rho^2}\,L_{n_r}^{|m|}\!\left(\frac{r^2}{\rho^2}\right)e^{im\varphi} \tag{9.556} \]

and the energy level structure is shown in Table 9.4 below.

    E/ħω   n_r    m    degeneracy
     1      0     0        1
     2      0    +1        2
     2      0    -1
     3      0    +2        3
     3      1     0
     3      0    -2
     4      0    +3        4
     4      1    +1
     4      1    -1
     4      0    -3

Table 9.4: Energy Levels - 2D Oscillator

which is the same structure (with different labels) as in the operator solution.
The fact that $\Delta m = 2$ in the 2-dimensional case is one of many peculiarities
associated with two dimensions that do not appear in three dimensions.
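
The agreement between the two labelings can be checked by enumeration. In the sketch below, `polar_labels` lists the $(n_r, m)$ pairs with $|m| + 2n_r = n$ and `cartesian_labels` lists the $(n_x, n_y)$ pairs with $n_x + n_y = n$; both function names are illustrative.

```python
def polar_labels(n):
    """All (n_r, m) with |m| + 2*n_r = n, i.e. E = hbar*omega*(n + 1)."""
    labels = []
    for n_r in range(n // 2 + 1):
        abs_m = n - 2 * n_r
        # The set removes the duplicate when abs_m == 0.
        for m in sorted({abs_m, -abs_m}, reverse=True):
            labels.append((n_r, m))
    return labels

def cartesian_labels(n):
    """All (n_x, n_y) with n_x + n_y = n."""
    return [(nx, n - nx) for nx in range(n + 1)]

for n in range(4):
    polar = polar_labels(n)
    print(f"E/hw = {n + 1}: degeneracy {len(polar)}, (n_r, m) = {polar}, "
          f"(n_x, n_y) = {cartesian_labels(n)}")
```

For each $n$ both lists have $n + 1$ entries, and the $m$ values run from $-n$ to $n$ in steps of 2, matching Table 9.4.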


9.6.5. What happens in 3 dimensions? 

In Cartesian coordinates we have a simple extension of the 2-dimensional case.

\[ E = \hbar\omega\left(n_x + n_y + n_z + \frac{3}{2}\right) = \hbar\omega\left(n + \frac{3}{2}\right) \tag{9.557} \]

The degeneracy is

\[ \frac{(n+1)(n+2)}{2} \tag{9.558} \]

The energy level structure is shown in Table 9.5 below.

    E/ħω   n_x   n_y   n_z   n   degeneracy
    3/2     0     0     0    0       1
    5/2     1     0     0    1       3
    5/2     0     1     0    1
    5/2     0     0     1    1
    7/2     2     0     0    2       6
    7/2     0     2     0    2
    7/2     0     0     2    2
    7/2     1     1     0    2
    7/2     1     0     1    2
    7/2     0     1     1    2

Table 9.5: Energy Levels - 3D Oscillator


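In spherical-polar coordinates, we can follow a procedure similar to the plane-polar 2-dimensional case to get

$$\psi_{n_r,\ell,m}(y,\theta,\varphi) = y^{\ell}\, e^{-y^2/2}\, L^{\ell+1/2}_{n_r}(y^2)\, Y_{\ell m}(\theta,\varphi) \tag{9.559}$$

where

$$r = \rho y \ , \quad \rho^2 = \frac{\hbar}{M\omega} \tag{9.560}$$

The corresponding energy values are

$$E_{n_r,\ell} = \hbar\omega\left(2n_r + \ell + \frac{3}{2}\right) \tag{9.561}$$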

which gives Table 9.6 below. 


E/ħω    n_r       ℓ         n = 2n_r + ℓ    degeneracy
3/2     0         0         0               1
5/2     0         1         1               3  →  m = ±1, 0
7/2     1 or 0    0 or 2    2               6  →  m = ±2, ±1, 0 (ℓ = 2) and m = 0 (ℓ = 0)

Table 9.6: Energy Levels - 3D Oscillator
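
The spherical labels must reproduce the Cartesian degeneracy level by level. As a small sketch (the helper `spherical_states` is an illustrative name), we can list every $(n_r, \ell, m)$ with $2n_r + \ell = n$ and $-\ell \le m \le \ell$ and compare the count with $(n+1)(n+2)/2$.

```python
def spherical_states(n):
    """Return all (n_r, l, m) with 2*n_r + l = n and -l <= m <= l."""
    states = []
    for n_r in range(n // 2 + 1):
        l = n - 2 * n_r              # l is fixed once n and n_r are chosen
        for m in range(-l, l + 1):   # 2l + 1 values of m
            states.append((n_r, l, m))
    return states

for n in range(4):
    count = len(spherical_states(n))
    print(f"n = {n}   E/hw = {n + 1.5}   states = {count}   (n+1)(n+2)/2 = {(n + 1) * (n + 2) // 2}")
```

For $n = 2$ this gives five states with $\ell = 2$ and one with $\ell = 0$, matching the last row of Table 9.6.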


Finally, we look at a case we skipped over (because it is the most difficult example of this type).

9.6.6. Two-Dimensional Finite Circular Well

We consider the potential in two dimensions 


$$V(r) = \begin{cases} -V_0 & r < a \\ 0 & r > a \end{cases} \tag{9.562}$$


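The Schrödinger equation in plane-polar coordinates is

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\varphi^2}\right]\psi(r,\varphi) - V_0\,\psi(r,\varphi) = E\,\psi(r,\varphi) \qquad r < a \tag{9.563}$$

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\varphi^2}\right]\psi(r,\varphi) = E\,\psi(r,\varphi) \qquad r > a \tag{9.564}$$

We assume (separation of variables)

$$\psi(r,\varphi) = R(r)\Phi(\varphi) \tag{9.565}$$

which gives

$$-\frac{\hbar^2}{2m}\left[\frac{1}{rR}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2\Phi}\frac{d^2\Phi}{d\varphi^2}\right] = E + V_0 \qquad r < a \tag{9.566}$$

$$-\frac{\hbar^2}{2m}\left[\frac{1}{rR}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2\Phi}\frac{d^2\Phi}{d\varphi^2}\right] = E \qquad r > a \tag{9.567}$$

We choose a separation constant

$$\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = -\alpha^2 \ \to \ \Phi(\varphi) = B\sin(\alpha\varphi + \delta) \tag{9.568}$$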


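The requirement of single-valuedness under a $\varphi$-rotation of $2\pi$ says that

$$\sin(\alpha\varphi + \delta) = \sin(\alpha\varphi + \delta + 2\pi\alpha) \ \to \ \alpha = \text{integer} = 0, 1, 2, 3, \ldots$$

Alternatively, we could write

$$\Phi(\varphi) = Be^{i\alpha\varphi} \ , \quad \alpha = \text{integer} = \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$$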

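Substitution of this solution leaves the radial differential equations

$$r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} + \left[\beta^2r^2 - \alpha^2\right]R = 0 \qquad r < a \quad \text{where} \quad \beta^2 = \frac{2m(E + V_0)}{\hbar^2} \tag{9.569}$$

$$r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} + \left[\lambda^2r^2 - \alpha^2\right]R = 0 \qquad r > a \quad \text{where} \quad \lambda^2 = \frac{2mE}{\hbar^2} \tag{9.570}$$

These are Bessel's equations. The general solutions are

$$R(r) = NJ_\alpha(\beta r) + MY_\alpha(\beta r) \qquad r < a \tag{9.571}$$

$$R(r) = PJ_\alpha(\lambda r) \qquad r > a \tag{9.572}$$
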

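and the complete solutions are then

$$\psi_{k\alpha}(r,\varphi) = \begin{cases} R(r)\Phi(\varphi) = \left(NJ_\alpha(\beta r) + MY_\alpha(\beta r)\right)e^{i\alpha\varphi} & r < a \\ R(r)\Phi(\varphi) = PJ_\alpha(\lambda r)\,e^{i\alpha\varphi} & r > a \end{cases} \tag{9.573}$$

The continuity (or boundary) conditions at $r = a$ are

$$NJ_\alpha(\beta a) + MY_\alpha(\beta a) = PJ_\alpha(\lambda a) \tag{9.574}$$

$$N\beta\left.\frac{dJ_\alpha(\beta r)}{d(\beta r)}\right|_{r=a} + M\beta\left.\frac{dY_\alpha(\beta r)}{d(\beta r)}\right|_{r=a} = P\lambda\left.\frac{dJ_\alpha(\lambda r)}{d(\lambda r)}\right|_{r=a} \tag{9.575}$$

Let us consider the case $\alpha = 0$. We have

$$NJ_0(\beta a) + MY_0(\beta a) = PJ_0(\lambda a) \tag{9.576}$$

$$N\beta\left.\frac{dJ_0(\beta r)}{d(\beta r)}\right|_{r=a} + M\beta\left.\frac{dY_0(\beta r)}{d(\beta r)}\right|_{r=a} = P\lambda\left.\frac{dJ_0(\lambda r)}{d(\lambda r)}\right|_{r=a} \tag{9.577}$$

where

$$J_0(x) = 1 - \frac{x^2}{2^2} + \frac{x^4}{2^2 4^2} - \frac{x^6}{2^2 4^2 6^2} + \ldots \tag{9.578}$$

$$\frac{dJ_0(x)}{dx} = -J_1(x) = -\frac{x}{2} + \frac{x^3}{2^2\, 4} - \frac{x^5}{2^2 4^2\, 6} + \frac{x^7}{2^2 4^2 6^2\, 8} - \ldots \tag{9.579}$$

$$Y_0(x) \approx \frac{2}{\pi}\left[\ln\left(\frac{x}{2}\right) + \gamma\right]J_0(x) \ , \quad \gamma = 0.5772156 \tag{9.580}$$

$$\frac{dY_0(x)}{dx} \approx \frac{2}{\pi}\left[\ln\left(\frac{x}{2}\right) + \gamma\right]\frac{dJ_0(x)}{dx} + \frac{2}{\pi x}J_0(x) \tag{9.581}$$

Clearly, the 2-dimensional finite well is very difficult.
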


9.6.7. The 3-Dimensional Delta Function 

We now consider a particle moving in 3 dimensions under the action of the attractive delta function potential at $r = a$ given by

$$V(r) = -\frac{\hbar^2}{2M\alpha}\,\delta(r - a) \tag{9.582}$$

Since this is a central force we know that we can write the solutions in the form

$$\psi(r,\theta,\varphi) = R(r)Y_{\ell m}(\theta,\varphi) \tag{9.583}$$

The Schrödinger equation is

$$\frac{\hbar^2}{2M}\left(\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{\ell(\ell+1)}{r^2}R\right) + \frac{\hbar^2}{2M\alpha}\,\delta(r - a)R - \frac{\hbar^2k^2}{2M}R = 0 \tag{9.584}$$

where, since we are looking for bound states, we have set

$$E = -\frac{\hbar^2k^2}{2M} < 0 \tag{9.585}$$

Now the function $R(r)$ must be continuous everywhere. But as we saw earlier in the 1-dimensional case, the derivative of $R$ is not continuous for delta functions. The discontinuity at $r = a$ is found by integrating the radial equation around the point $r = a$.

$$\frac{\hbar^2}{2M}\int_{a^-}^{a^+} r^2dr\left(\frac{d^2R(r)}{dr^2} + \frac{2}{r}\frac{dR(r)}{dr} - \frac{\ell(\ell+1)}{r^2}R(r)\right) + \frac{\hbar^2}{2M\alpha}\int_{a^-}^{a^+} r^2dr\,\delta(r - a)R(r) - \frac{\hbar^2k^2}{2M}\int_{a^-}^{a^+} r^2dr\,R(r) = 0 \tag{9.586}$$

which gives the second boundary (continuity) condition

$$\frac{dR(a^+)}{dr} - \frac{dR(a^-)}{dr} = -\frac{1}{\alpha}R(a) \tag{9.587}$$

For $r > a$ and $r < a$ we then have the equation

$$\frac{\hbar^2}{2M}\left(\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{\ell(\ell+1)}{r^2}R\right) - \frac{\hbar^2k^2}{2M}R = 0$$

which has as a solution for the case $\ell = 0$

$$R(r) = \frac{1}{r}\left(Ae^{-kr} + Be^{kr}\right) \tag{9.588}$$

For $R(r)$ to be well behaved at $r = 0$ we must choose $B = -A$. Therefore, the solution for $r < a$ is

$$R(r) = \frac{c}{r}\sinh kr \tag{9.589}$$

For $R(r)$ to be well behaved as $r \to \infty$ we must choose $B = 0$. Therefore the solution for $r > a$ is

$$R(r) = \frac{b}{r}e^{-kr} \tag{9.590}$$

The boundary conditions at $r = a$ then give the equations

$$\frac{c}{a}\sinh ka = \frac{b}{a}e^{-ka} \ \to \ c\sinh ka = be^{-ka} \tag{9.591}$$

and

$$0 = -ka\left(be^{-ka} + c\cosh ka\right) + \frac{a}{\alpha}\,c\sinh ka \tag{9.592}$$

Eliminating $b$ and $c$ we get a transcendental equation for $k$ (or $E$)

$$\frac{\alpha}{a} = \frac{1 - e^{-2ka}}{2ka} \tag{9.593}$$

The right-hand side of this equation has a range

$$0 < \frac{1 - e^{-2ka}}{2ka} < 1 \tag{9.594}$$

Therefore, the allowed range of the parameter $\alpha$, in order that an $\ell = 0$ bound state exist, is $0 < \alpha < a$.
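
The transcendental equation for $ka$ has no closed-form solution, but because its right-hand side decreases monotonically from 1 to 0 it is easy to solve by bisection. The sketch below is illustrative only: the function names and the sample value $\alpha/a = 0.5$ are our own choices, and the energy is reported in units of $\hbar^2/2Ma^2$, so that $E = -(ka)^2$.

```python
import math

def rhs(x):
    """Right-hand side (1 - exp(-2x)) / (2x) with x = k*a; decreases from 1 to 0."""
    return -math.expm1(-2.0 * x) / (2.0 * x)

def solve_ka(alpha_over_a):
    """Solve alpha/a = (1 - exp(-2ka)) / (2ka) for ka by bisection.

    Returns None when alpha/a lies outside (0, 1), where no l = 0 bound state exists.
    """
    if not 0.0 < alpha_over_a < 1.0:
        return None
    lo, hi = 1e-12, 1.0
    while rhs(hi) > alpha_over_a:    # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):             # rhs is monotonic, so plain bisection converges
        mid = 0.5 * (lo + hi)
        if rhs(mid) > alpha_over_a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

ka = solve_ka(0.5)
print("ka =", ka, "  E in units of hbar^2/(2 M a^2) =", -ka**2)
print("alpha/a = 1.5 ->", solve_ka(1.5))
```

Values of $\alpha/a$ close to 1 give $ka \to 0$, that is, a bound state that is just barely bound.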


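Just this once let us ask: what can we say about $\ell \neq 0$? This will illustrate some properties of the spherical Bessel functions. We have the radial equation

$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} + \left(k^2r^2 - \ell(\ell+1)\right)R = 0 \tag{9.595}$$

which as we have seen before is the spherical Bessel function equation with the general solution

$$R(r) = Aj_\ell(kr) + B\eta_\ell(kr) \tag{9.596}$$

For $r < a$ we must choose $B = 0$ ($\eta_\ell(kr)$ diverges at $r = 0$) and for $r > a$ we must choose $B = iA$, which leads to a solution that drops off exponentially as $r \to \infty$. So we finally have

$$r < a \qquad R(r) = Gj_\ell(kr) \tag{9.597}$$

$$r > a \qquad R(r) = Hh^{(1)}_\ell(kr) = H\left(j_\ell(kr) + i\eta_\ell(kr)\right) \tag{9.598}$$

The boundary conditions then give

$$Gj_\ell(ka) = Hh^{(1)}_\ell(ka) = H\left(j_\ell(ka) + i\eta_\ell(ka)\right) \tag{9.599}$$

$$-\frac{G}{\alpha}j_\ell(ka) = H\frac{dh^{(1)}_\ell(ka)}{dr} - G\frac{dj_\ell(ka)}{dr} = H\left(\frac{dj_\ell(ka)}{dr} + i\frac{d\eta_\ell(ka)}{dr}\right) - G\frac{dj_\ell(ka)}{dr} \tag{9.600}$$

Now

$$\frac{dj_\ell(kr)}{dr} = \frac{k}{2\ell+1}\left[\ell j_{\ell-1}(kr) - (\ell+1)j_{\ell+1}(kr)\right] \tag{9.601}$$

and similarly for $\eta_\ell$ and $h^{(1)}_\ell$. Therefore, we get

$$Gj_\ell(ka) = \left(j_\ell(ka) + i\eta_\ell(ka)\right)H \tag{9.602}$$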

Gk 

[ije-i(ka) -(£ + l)j M (ka)] 

Ck C 

- —~ [£ru-i(ka) -(£ + l)ru+i(ka)] - je(ka) 

21+1 a 

= x [£je~i(ka) - (£+l)j M (ka)]H (9.603) 


or 

rji(ka) _ [£r/e-i(ka) -(£+ l)r] M (ka)] - ^^p-je(ka) 
je(ka) + irje(ka) [Ije-^ka) -(£+ l)je+i(ka)] 

This is the transcendental equation for the energy!!! That is enough for nonzero 
angular momentum! 
9.6.8. The Hydrogen Atom

Schrödinger Equation Solution

The potential energy function

$$V(r) = -\frac{Ze^2}{r} \tag{9.604}$$


represents the attractive Coulomb interaction between an atomic nucleus of 
charge +Ze and an electron of charge -e. 


The Schrodinger equation we have been using describes the motion of a single 
particle in an external field. In the hydrogen atom, however, we are interested 
in the motion of two particles (nucleus and electron) that are attracted to each 
other via the potential above (where r is the separation distance between the 
two particles). 


We start by writing the Schrodinger equation for the two particle system. It 
involves six coordinates (three for each particle). We use Cartesian coordinates 
to start the discussion. We have 

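$$H = \frac{\vec{p}^{\,2}_{1,op}}{2m_1} + \frac{\vec{p}^{\,2}_{2,op}}{2m_2} + V(\vec{r}) \quad \text{where} \quad \vec{r} = \vec{r}_1 - \vec{r}_2 \tag{9.605}$$

which gives

$$-\frac{\hbar^2}{2m_1}\left(\frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial y_1^2} + \frac{\partial^2}{\partial z_1^2}\right)\phi(x_1,y_1,z_1,x_2,y_2,z_2) - \frac{\hbar^2}{2m_2}\left(\frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial y_2^2} + \frac{\partial^2}{\partial z_2^2}\right)\phi(x_1,y_1,z_1,x_2,y_2,z_2)$$
$$+\ V(x_1 - x_2, y_1 - y_2, z_1 - z_2)\,\phi(x_1,y_1,z_1,x_2,y_2,z_2) = E\,\phi(x_1,y_1,z_1,x_2,y_2,z_2) \tag{9.606}$$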


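We now introduce relative and center-of-mass (CM) coordinates by

$$x = x_1 - x_2 \ , \quad y = y_1 - y_2 \ , \quad z = z_1 - z_2$$
$$\vec{r} = (x, y, z)$$
$$MX = m_1x_1 + m_2x_2 \ , \quad MY = m_1y_1 + m_2y_2 \ , \quad MZ = m_1z_1 + m_2z_2$$
$$\vec{R} = (X, Y, Z)$$
$$M = m_1 + m_2 = \text{total mass of the system}$$

Substitution gives

$$-\frac{\hbar^2}{2M}\left(\frac{\partial^2}{\partial X^2} + \frac{\partial^2}{\partial Y^2} + \frac{\partial^2}{\partial Z^2}\right)\phi(x,y,z,X,Y,Z) - \frac{\hbar^2}{2\mu}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\phi(x,y,z,X,Y,Z)$$
$$+\ V(x,y,z)\,\phi(x,y,z,X,Y,Z) = E\,\phi(x,y,z,X,Y,Z) \tag{9.607}$$

where

$$\mu = \frac{m_1m_2}{m_1 + m_2} = \text{the reduced mass} \tag{9.608}$$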

We can now separate the variables by assuming a solution of the form

$$\phi(x,y,z,X,Y,Z) = \psi(x,y,z)\Psi(X,Y,Z) \tag{9.609}$$

which gives the two equations

$$-\frac{\hbar^2}{2\mu}\nabla^2_{\vec{r}}\,\psi(\vec{r}) + V(r)\psi(\vec{r}) = E\psi(\vec{r}) \tag{9.610}$$

$$-\frac{\hbar^2}{2M}\nabla^2_{\vec{R}}\,\Psi(\vec{R}) = E_{CM}\Psi(\vec{R}) \tag{9.611}$$

The second equation says that the CM of the two particles is like a free particle of mass $M$.

The first equation describes the relative motion of the two particles and is the same as the equation of motion of a particle of mass $\mu$ in an external potential energy $V(r)$.

In the hydrogen atom problem we are only interested in the energy levels $E$ associated with the relative motion. In addition, since the nuclear mass is so much larger than the electron mass, we have

$$\mu \approx m_e = m_{electron} \tag{9.612}$$

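This is a central force so we assume a solution of the form

$$\psi(\vec{r}) = R(r)Y_{\ell m}(\theta,\varphi) \tag{9.613}$$

and obtain the radial equation

$$-\frac{\hbar^2}{2\mu}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{Ze^2}{r}R + \frac{\ell(\ell+1)\hbar^2}{2\mu r^2}R = ER \tag{9.614}$$

where $E < 0$ for a bound state. We follow the same approach as before. We change the variables so that the equation is in dimensionless form by introducing $\rho = \alpha r$ where

$$\alpha^2 = \frac{8\mu|E|}{\hbar^2} \quad \text{and} \quad \lambda = \frac{2\mu Ze^2}{\alpha\hbar^2} = \frac{Ze^2}{\hbar}\left(\frac{\mu}{2|E|}\right)^{1/2} \tag{9.615}$$

We get

$$\frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{dR}{d\rho}\right) + \left(\frac{\lambda}{\rho} - \frac{1}{4} - \frac{\ell(\ell+1)}{\rho^2}\right)R = 0 \tag{9.616}$$

For $\rho \to \infty$ the equation becomes

$$\frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{dR}{d\rho}\right) - \frac{1}{4}R = 0 \tag{9.617}$$

which has the solution

$$R \to \rho^n e^{\pm\frac{1}{2}\rho} \tag{9.618}$$

where $n$ can have any finite value. Since we want a normalizable solution, we will look for a solution of the form

$$R \to F(\rho)e^{-\frac{1}{2}\rho} \tag{9.619}$$

Substitution gives an equation for $F$

$$\frac{d^2F}{d\rho^2} + \left(\frac{2}{\rho} - 1\right)\frac{dF}{d\rho} + \left[\frac{\lambda - 1}{\rho} - \frac{\ell(\ell+1)}{\rho^2}\right]F = 0 \tag{9.620}$$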


We solve this equation (it is Laguerre's equation) by series substitution.

$$F(\rho) = \rho^s\left(a_0 + a_1\rho + a_2\rho^2 + \ldots\right) = \rho^sL(\rho) \tag{9.621}$$

where $a_0 \neq 0$ and $s \geq 0$. We must also have that $F(0)$ is finite. Substitution gives an equation for $L$.

$$\rho^2\frac{d^2L}{d\rho^2} + \rho\left[2(s+1) - \rho\right]\frac{dL}{d\rho} + \left[\rho(\lambda - s - 1) + s(s+1) - \ell(\ell+1)\right]L = 0 \tag{9.622}$$

If we set $\rho = 0$ in this equation and use the fact that $L$ is a power series we get the condition

$$s(s+1) - \ell(\ell+1) = 0 \tag{9.623}$$

or

$$s = \ell \quad \text{or} \quad s = -(\ell+1) \tag{9.624}$$

Since $R$ must be finite at the origin we exclude the second possibility and choose $s = \ell$. This gives

$$\rho^2\frac{d^2L}{d\rho^2} + \rho\left[2(\ell+1) - \rho\right]\frac{dL}{d\rho} + \left[\rho(\lambda - \ell - 1)\right]L = 0 \tag{9.625}$$

Substituting the power series gives a recursion relation for the coefficients

$$a_{\nu+1} = \frac{\nu + \ell + 1 - \lambda}{(\nu+1)(\nu + 2\ell + 2)}\,a_\nu \tag{9.626}$$

If the series does not terminate, then this recursion relation behaves like

$$\frac{a_{\nu+1}}{a_\nu} \to \frac{1}{\nu} \tag{9.627}$$

for large $\nu$. This corresponds to the series for $\rho^ne^{\rho}$. Since this will give a non-normalizable result, we must terminate the series by choosing

$$\lambda = n = \text{positive integer} = n' + \ell + 1 = \text{total quantum number} \tag{9.628}$$

where $n'$ (= radial quantum number) is the largest power of $\rho$ in the solution $L$. Since $n'$ and $\ell$ are non-negative integers, $n = 1, 2, 3, 4, \ldots$
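
The termination of the series can be seen directly by iterating the recursion relation (9.626) with $\lambda = n$. The following is a minimal sketch using exact rational arithmetic; the function names and the normalization $a_0 = 1$ are our own illustrative choices.

```python
from fractions import Fraction

def series_coefficients(n, l, n_terms=8):
    """Coefficients a_0 .. a_{n_terms-1} from recursion (9.626) with lambda = n and a_0 = 1."""
    coeffs = [Fraction(1)]
    for v in range(n_terms - 1):
        ratio = Fraction(v + l + 1 - n, (v + 1) * (v + 2 * l + 2))
        coeffs.append(coeffs[-1] * ratio)
    return coeffs

def polynomial_degree(n, l):
    """Highest power of rho with a non-zero coefficient, i.e. the radial quantum number n'."""
    coeffs = series_coefficients(n, l, n_terms=n + 3)
    return max(v for v, a in enumerate(coeffs) if a != 0)

for n, l in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1)]:
    print(n, l, [str(a) for a in series_coefficients(n, l, 5)], "degree =", polynomial_degree(n, l))
```

In each case the numerator $\nu + \ell + 1 - n$ vanishes at $\nu = n - \ell - 1$, so every later coefficient is zero and $L$ is a polynomial of degree $n' = n - \ell - 1$.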


We thus obtain the solution for the energies

$$E_n = -|E_n| = -\frac{\mu Z^2e^4}{2\hbar^2n^2} = -\frac{Z^2e^2}{2a_0n^2} \tag{9.629}$$

where

$$a_0 = \text{Bohr radius} = \frac{\hbar^2}{\mu e^2} \tag{9.630}$$

Unlike the finite square well, where we had a finite number of bound state levels, in this case we obtain an infinite set of discrete energies. This results from the very slow decrease of the Coulomb potential energy with distance.


The Laguerre polynomial solutions are given by the generating function

$$G(\rho,s) = \frac{e^{-\frac{\rho s}{1-s}}}{1-s} = \sum_{q=0}^{\infty}\frac{L_q(\rho)}{q!}s^q \ , \quad s < 1 \tag{9.631}$$

By differentiating the generating function with respect to $\rho$ and $s$ we can show that

$$\frac{dL_q}{d\rho} - q\frac{dL_{q-1}}{d\rho} = -qL_{q-1} \tag{9.632}$$

$$L_{q+1} = (2q + 1 - \rho)L_q - q^2L_{q-1} \tag{9.633}$$

The lowest order differential equation involving only $L_q$ that can be constructed from these two equations is

$$\rho\frac{d^2L_q}{d\rho^2} + (1 - \rho)\frac{dL_q}{d\rho} + qL_q = 0 \tag{9.634}$$

This is not quite our original equation. However, if we define the associated Laguerre polynomials by

$$L^p_q(\rho) = \frac{d^p}{d\rho^p}L_q(\rho) \tag{9.635}$$

then differentiating the differential equation $p$ times we get

$$\rho\frac{d^2L^p_q}{d\rho^2} + (p + 1 - \rho)\frac{dL^p_q}{d\rho} + (q - p)L^p_q = 0 \tag{9.636}$$

Setting $\lambda = n$ we then see that the solutions to the Schrödinger equation are

$$L^{2\ell+1}_{n+\ell}(\rho) \tag{9.637}$$

which are polynomials of order $(n + \ell) - (2\ell + 1) = n - \ell - 1$ in agreement with the earlier results.

We can differentiate the generating function $p$ times to get

$$G_p(\rho,s) = \frac{(-s)^pe^{-\frac{\rho s}{1-s}}}{(1-s)^{p+1}} = \sum_{q=p}^{\infty}\frac{L^p_q(\rho)}{q!}s^q \ , \quad s < 1 \tag{9.638}$$

Explicitly, we then have

$$L^{2\ell+1}_{n+\ell}(\rho) = \sum_{k=0}^{n-\ell-1}(-1)^{k+1}\frac{\left[(n+\ell)!\right]^2\rho^k}{(n-\ell-1-k)!\,(2\ell+1+k)!\,k!} \tag{9.639}$$

The normalized radial wave functions are of the form

$$R_{n\ell}(r) = -\left\{\left(\frac{2Z}{na_0}\right)^3\frac{(n-\ell-1)!}{2n\left[(n+\ell)!\right]^3}\right\}^{1/2}e^{-\frac{1}{2}\rho}\rho^{\ell}L^{2\ell+1}_{n+\ell}(\rho) \tag{9.640}$$

with

$$a_0 = \frac{\hbar^2}{\mu e^2} \quad \text{and} \quad \rho = \frac{2Z}{na_0}r \tag{9.641}$$

and

$$\psi_{n\ell m}(r,\theta,\varphi) = R_{n\ell}(r)Y_{\ell m}(\theta,\varphi) \tag{9.642}$$

The first few radial wave functions are

$$R_{10}(r) = \left(\frac{Z}{a_0}\right)^{3/2}2e^{-\frac{Zr}{a_0}} \tag{9.643}$$

$$R_{20}(r) = \left(\frac{Z}{2a_0}\right)^{3/2}\left(2 - \frac{Zr}{a_0}\right)e^{-\frac{Zr}{2a_0}} \tag{9.644}$$

$$R_{21}(r) = \left(\frac{Z}{2a_0}\right)^{3/2}\frac{Zr}{a_0\sqrt{3}}\,e^{-\frac{Zr}{2a_0}} \tag{9.645}$$

What about degeneracy since the energy values do not depend on $\ell$ and $m$?



For each value of $n$, $\ell$ can vary between 0 and $n - 1$, and for each of these $\ell$ values $m$ can vary between $-\ell$ and $\ell$ ($2\ell + 1$ values). Therefore the total degeneracy of an energy level $E_n$ is given by

$$\sum_{\ell=0}^{n-1}(2\ell + 1) = 2\,\frac{n(n-1)}{2} + n = n^2 \tag{9.646}$$

The degeneracy with respect to $m$ is true for any central force (as we have seen in other examples). The $\ell$ degeneracy, however, is characteristic of the Coulomb potential alone. It is called an accidental degeneracy.
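
As a quick numerical illustration of the level structure, the sketch below tabulates $E_n$ and the degeneracy for the first few levels. It assumes $Z = 1$ and the infinite-nuclear-mass value $e^2/2a_0 \approx 13.605693$ eV, a constant that is not quoted in the text; the function names are ours.

```python
RYDBERG_EV = 13.605693   # e^2 / (2 a_0) in eV (assumed value, infinite nuclear mass)

def energy_ev(n, Z=1):
    """Bound-state energy E_n = -Z^2 e^2 / (2 a_0 n^2), in eV."""
    return -RYDBERG_EV * Z**2 / n**2

def degeneracy(n):
    """Count the (l, m) pairs for a given n: sum over l = 0..n-1 of (2l + 1)."""
    return sum(2 * l + 1 for l in range(n))

for n in range(1, 5):
    print(f"n = {n}   E_n = {energy_ev(n):9.4f} eV   degeneracy = {degeneracy(n)}")
```

The degeneracy column reproduces $n^2$; including electron spin would double each entry, but spin is not part of the counting done here.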


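Some useful expectation values are:

$$\langle r\rangle_{n\ell m} = \frac{a_0}{2Z}\left[3n^2 - \ell(\ell+1)\right] \ , \quad \left\langle\frac{1}{r}\right\rangle_{n\ell m} = \frac{Z}{a_0n^2} \tag{9.647}$$

$$\left\langle\frac{1}{r^2}\right\rangle_{n\ell m} = \frac{Z^2}{a_0^2n^3\left(\ell + \frac{1}{2}\right)} \ , \quad \left\langle\frac{1}{r^3}\right\rangle_{n\ell m} = \frac{Z^3}{a_0^3n^3\,\ell\left(\ell + \frac{1}{2}\right)(\ell + 1)} \tag{9.648}$$

9.6.9. Algebraic Solution of the Hydrogen Atom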

Review of the Classical Kepler Problem 

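In the classical Kepler problem of an electron of charge $-e$ moving in the electric force field of a positive charge $Ze$ located at the origin, the angular momentum

$$\vec{L} = \vec{r}\times\vec{p} \tag{9.649}$$

is constant. The motion takes place in a plane perpendicular to the constant angular momentum direction. Newton's second law tells us that the rate of change of the momentum is equal to the force

$$\frac{d\vec{p}}{dt} = -\frac{Ze^2}{r^2}\hat{r} \quad \text{where} \quad r = |\vec{r}| \quad \text{and} \quad \hat{r} = \frac{\vec{r}}{r} = \text{unit vector} \tag{9.650}$$

The so-called Runge-Lenz vector is defined by

$$\vec{A} = \frac{1}{Ze^2m}(\vec{L}\times\vec{p}) + \hat{r} \tag{9.651}$$

If we take its time derivative we find

$$\frac{d\vec{L}}{dt} = 0 \quad \text{and} \quad \vec{L} = m\vec{r}\times\frac{d\vec{r}}{dt} = mr^2\,\hat{r}\times\frac{d\hat{r}}{dt} + mr(\hat{r}\times\hat{r})\frac{dr}{dt} = mr^2\,\hat{r}\times\frac{d\hat{r}}{dt} \tag{9.652}$$

$$\frac{d\vec{A}}{dt} = \frac{1}{Ze^2m}\vec{L}\times\frac{d\vec{p}}{dt} + \frac{1}{Ze^2m}\frac{d\vec{L}}{dt}\times\vec{p} + \frac{d\hat{r}}{dt} = -\left(\hat{r}\times\frac{d\hat{r}}{dt}\right)\times\hat{r} + \frac{d\hat{r}}{dt}$$
$$= -\left(\frac{d\hat{r}}{dt}(\hat{r}\cdot\hat{r}) - \hat{r}\left(\hat{r}\cdot\frac{d\hat{r}}{dt}\right)\right) + \frac{d\hat{r}}{dt} = -\frac{d\hat{r}}{dt} + \frac{\hat{r}}{2}\frac{d(\hat{r}\cdot\hat{r})}{dt} + \frac{d\hat{r}}{dt} = 0 \tag{9.653}$$

Thus, the vector $\vec{A}$ is a constant of the motion. It corresponds physically to the length and direction of the semi-major axis of the classical elliptical orbit. The equation of the elliptical orbit is easily found using the $\vec{A}$ vector.

$$\vec{A}\cdot\vec{r} = ar\cos\theta = \frac{1}{Ze^2m}(\vec{L}\times\vec{p})\cdot\vec{r} + \hat{r}\cdot\vec{r} \tag{9.654}$$

$$ar\cos\theta = -\frac{1}{Ze^2m}\vec{L}\cdot(\vec{r}\times\vec{p}) + r = -\frac{L^2}{Ze^2m} + r \tag{9.655}$$

$$\frac{1}{r} = \frac{Ze^2m}{L^2}(1 - a\cos\theta) \ \to \ \text{orbit equation (conic sections)} \tag{9.656}$$

where

$$a = |\vec{A}| = \text{eccentricity} \tag{9.657}$$

and the direction of $\vec{A}$ is from the origin to the aphelion.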
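
The conservation of the Runge-Lenz vector can also be checked numerically by integrating Newton's equations for the inverse-square force and monitoring $\vec{A}$ along the orbit. The sketch below is illustrative only: it uses units with $m = Ze^2 = 1$, a hand-written fourth-order Runge-Kutta step, and initial conditions of our own choosing (a bound orbit started at aphelion).

```python
import math

def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def runge_lenz(r, p):
    """A = L x p + r_hat in units where m = Z e^2 = 1."""
    L = cross(r, p)
    Lxp = cross(L, p)
    norm = math.sqrt(sum(x * x for x in r))
    return tuple(Lxp[i] + r[i] / norm for i in range(3))

def derivs(state):
    """Time derivative of (r, p): dr/dt = p, dp/dt = -r / |r|^3."""
    r, p = state[:3], state[3:]
    norm3 = math.sqrt(sum(x * x for x in r)) ** 3
    return p + tuple(-x / norm3 for x in r)

def rk4_step(state, dt):
    """Advance the 6-component state by one Runge-Kutta step."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = derivs(state)
    k2 = derivs(shifted(k1, dt / 2))
    k3 = derivs(shifted(k2, dt / 2))
    k4 = derivs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_drift(r0=(1.0, 0.0, 0.0), p0=(0.0, 0.8, 0.0), dt=1e-3, n_steps=10000):
    """Return the initial A and the largest change in any component of A along the orbit."""
    state = tuple(r0) + tuple(p0)
    a0 = runge_lenz(state[:3], state[3:])
    drift = 0.0
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        a = runge_lenz(state[:3], state[3:])
        drift = max(drift, max(abs(x - y) for x, y in zip(a, a0)))
    return a0, drift

a0, drift = max_drift()
print("A at t = 0:", a0, "  |A| =", math.sqrt(sum(x * x for x in a0)))
print("largest change in A over the run:", drift)
```

With these initial conditions $|\vec{A}|$ equals the eccentricity obtained from the energy and angular momentum, $\sqrt{1 + 2EL^2}$ in these units, and $\vec{A}$ points along $+x$, toward the starting point, which is the aphelion.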
The Quantum Mechanical Problem

In order to make this useful in quantum mechanics, the classical Runge-Lenz vector must be generalized. In classical physics

$$\vec{L}\times\vec{p} = -\vec{p}\times\vec{L} \tag{9.658}$$

In quantum mechanics, this identity is not valid since the components of $\vec{L}_{op}$ do not commute. The correct quantum mechanical generalization (remember it must be a Hermitian operator if it is a physical observable) of the vector $\vec{A}$ is

$$\vec{A}_{op} = \frac{1}{2Ze^2m}\left(\vec{L}_{op}\times\vec{p}_{op} - \vec{p}_{op}\times\vec{L}_{op}\right) + \left(\frac{\vec{r}}{r}\right)_{op} \tag{9.659}$$

This satisfies $[H, \vec{A}_{op}] = 0$, which says $\vec{A}$ is a constant of the motion. It also satisfies $\vec{A}_{op}\cdot\vec{L}_{op} = 0$.


We can derive the following commutators: 

[Zj,;, A :j ] — ifiEijkAk , — ihEijkPk ? \Li , Vj ] — ihc pv ^ (9.660) 

which gives 

Aq P — 2 (2i op x Pop + ihpop) + ^-j (9.661) 

We now derive two important properties of A. The first property follows from 
the commutators 


[(iop x Pop) i iPj\ + ^Pii (.L op X Pop)^.J - 0 

^(z^op x Pop)^ ^ (.L 0 p x Pop) j j = -iheijkLkP — -itiEijkP Lk 


(L op x pop). , 

' L J- J 


' 1 (Lop X Pop) j 


■ 1 

= 2ih£ijk — 


which leads to

$$[A_i, A_j] = i\hbar\left(-\frac{2H}{Z^2e^4m}\right)\varepsilon_{ijk}L_k$$

where

$$H = \text{Hamiltonian} = \frac{\vec{p}^{\,2}_{op}}{2m} - Ze^2\left(\frac{1}{r}\right)_{op}$$

The second property follows from the relations 
( L op x p op ) • (L op x p 0 p) = L 0 pP 0 p 

^ —^ (L op x Pop) (L 0 p x Pop) ' ^ — | = | *Pop 

Pop-(-) =(-) -Pop-Zihl-) 

\r / 0 p \r / 0 p \r) op 

Pop * L op — L op • Pop — 0 
Pop * (L op x Pop) + (L 0 p x Pop) ' L op — 2 ihpop 


(9.662) 

(9.663) 

(9.664) 

(9.665) 

(9.666) 

(9.667) 

(9.668) 

(9.669) 

(9.670) 

(9.671) 
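which lead to

$$A^2_{op} = \vec{A}_{op}\cdot\vec{A}_{op} = 1 + \frac{2H}{Z^2e^4m}\left(L^2_{op} + \hbar^2\right) \tag{9.672}$$

We now define the two new operators (they are ladder operators) by

$$\vec{I}^{\,\pm}_{op} = \frac{1}{2}\left[\vec{L}_{op} \pm \left(-\frac{Z^2e^4m}{2H}\right)^{1/2}\vec{A}_{op}\right] \tag{9.673}$$

Using the relations we have derived for $[A_i, A_j]$ and $[L_i, A_j]$ we have

$$\left[I^{+}_{op,i}, I^{-}_{op,j}\right] = 0 \tag{9.674}$$

which means they have a common eigenbasis. We also have

$$\left[I^{\pm}_{op,i}, I^{\pm}_{op,j}\right] = i\hbar\,\varepsilon_{ijk}I^{\pm}_{op,k} \tag{9.675}$$

which are the standard angular momentum component commutation relations. Since they each commute with $H$, they also have a common eigenbasis with $H$.

Therefore, we can find a set of states such that

$$\left(\vec{I}^{\,\pm}_{op}\right)^2|\psi\rangle = i_\pm(i_\pm + 1)\hbar^2|\psi\rangle \quad \text{and} \quad H|\psi\rangle = E|\psi\rangle \tag{9.676}$$

We can show that $(\vec{I}^{\,+}_{op})^2 = (\vec{I}^{\,-}_{op})^2$, which implies that $i_+ = i_-$. We also have (as before)

$$i_+ = 0, \frac{1}{2}, 1, \frac{3}{2}, 2, \frac{5}{2}, 3, \ldots \tag{9.677}$$

Since $\vec{A}_{op}\cdot\vec{L}_{op} = 0$ we get

$$2\left[\left(\vec{I}^{\,+}_{op}\right)^2 + \left(\vec{I}^{\,-}_{op}\right)^2\right] + \hbar^2 = -\frac{Z^2e^4m}{2H} \tag{9.678}$$

$$\left[4i_+(i_+ + 1) + 1\right]\hbar^2 = -\frac{Z^2e^4m}{2E} \tag{9.679}$$

$$E = -\frac{Z^2e^4m}{2\hbar^2(2i_+ + 1)^2} = -\frac{Z^2e^4m}{2\hbar^2n^2} \tag{9.680}$$

where we have set $n = 2i_+ + 1 = 1, 2, 3, 4, 5, \ldots$, which are the correct energy values for hydrogen.

While the energy depends only on the quantum number $i_+$ (or $n$), each state has a degeneracy depending on the number of $z$-component values for each $i_\pm$ value. This is

$$(2i_+ + 1)(2i_- + 1) = (2i_+ + 1)^2 = n^2 \tag{9.681}$$

which is the correct degeneracy.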
9.6.10. The Deuteron

A deuteron is a bound state of a neutron and a proton. We can consider this system as a single particle with reduced mass

$$\mu = \frac{m_{proton}\,m_{neutron}}{m_{proton} + m_{neutron}} \approx \frac{m_{proton}}{2} = \frac{m_p}{2} \tag{9.682}$$

moving in a fixed potential $V(r)$, where $r$ is the proton-neutron separation. As a first approximation, we assume that the nuclear interaction binding the proton and the neutron into a deuteron is a finite square well in 3 dimensions.

$$V(r) = \begin{cases} -V_0 & r \leq a \\ 0 & r > a \end{cases} \tag{9.683}$$

The physical properties of the deuteron system are:

1. Almost an $\ell = 0$ state (a small admixture of $\ell = 2$ is present). We will assume $\ell = 0$.

2. Only one bound state exists.

3. The depth and range of the potential is such that the deuteron is weakly bound. The energy level in the potential corresponds to the binding energy of the system, where

$$E = \text{binding energy} = m_{deuteron}c^2 - (m_{proton} + m_{neutron})c^2 < 0 \tag{9.684}$$

Experimentally, it has been found that $E = -2.228$ MeV.

4. By weakly bound we mean

$$\frac{|E|}{(m_{proton} + m_{neutron})c^2} \ll 1 \tag{9.685}$$

In fact, for the deuteron we have

$$\frac{|E|}{(m_{proton} + m_{neutron})c^2} \approx 0.001 \tag{9.686}$$

5. This system is so weakly bound that any small decrease in the radius of the well $a$ or small reduction in the depth of the well $V_0$ would cause the system to break up (no bound state exists).

We derived the solutions for this potential earlier.

$$R(r) = Aj_0(\alpha r) \qquad r < a$$
$$R(r) = Bh^{(1)}_0(i\beta r) \qquad r > a$$
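where

$$\alpha^2 = \frac{2m}{\hbar^2}(V_0 - |E|) \quad \text{and} \quad \beta^2 = \frac{2m}{\hbar^2}|E| \tag{9.687}$$

The transcendental equation arising from matching boundary conditions at $r = a$ gave us the equations

$$\eta = -\xi\cot\xi \quad \text{and} \quad \xi^2 + \eta^2 = \frac{2mV_0a^2}{\hbar^2} \tag{9.688}$$

where

$$\xi = \alpha a \quad \text{and} \quad \eta = \beta a \tag{9.689}$$

The graphical solution as shown below in Figure 9.17 plots

$$\eta = -\xi\cot\xi \quad \text{and} \quad \eta^2 = \frac{2mV_0a^2}{\hbar^2} - \xi^2 \quad \text{versus} \quad \xi \tag{9.690}$$

Figure 9.17: Deuteron Solution

We found a finite number of bound states for given values of the well parameters. In particular

$$\frac{2mV_0a^2}{\hbar^2} < \left(\frac{\pi}{2}\right)^2 \ \to \ \text{no solution}$$
$$\frac{2mV_0a^2}{\hbar^2} > \left(\frac{\pi}{2}\right)^2 \ \to \ \text{1 solution}$$
$$\frac{2mV_0a^2}{\hbar^2} > \left(\frac{3\pi}{2}\right)^2 \ \to \ \text{2 solutions}$$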
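For the weakly bound deuteron system we expect that

$$\xi = \frac{\pi}{2} + \varepsilon \ , \quad \varepsilon \ll 1 \tag{9.691}$$

i.e., the radius of the circle is barely large enough to get a single (weakly) bound state.

Substituting into the transcendental equation we get

$$\eta = -\left(\frac{\pi}{2} + \varepsilon\right)\frac{\cos\left(\frac{\pi}{2} + \varepsilon\right)}{\sin\left(\frac{\pi}{2} + \varepsilon\right)} = -\left(\frac{\pi}{2} + \varepsilon\right)\frac{\cos\left(\frac{\pi}{2}\right)\cos(\varepsilon) - \sin\left(\frac{\pi}{2}\right)\sin(\varepsilon)}{\sin\left(\frac{\pi}{2}\right)\cos(\varepsilon) + \cos\left(\frac{\pi}{2}\right)\sin(\varepsilon)} \approx \left(\frac{\pi}{2} + \varepsilon\right)\varepsilon \approx \frac{\pi}{2}\varepsilon \tag{9.692}$$

Now substituting into the circle equation we get

$$\left(\frac{\pi}{2} + \varepsilon\right)^2 + \left(\frac{\pi}{2}\varepsilon\right)^2 = \frac{2mV_0a^2}{\hbar^2} \ \to \ \frac{\pi^2}{4} + \pi\varepsilon + \left(1 + \frac{\pi^2}{4}\right)\varepsilon^2 = \frac{2mV_0a^2}{\hbar^2}$$

Dropping the small quadratic terms in $\varepsilon$ we get

$$\varepsilon \approx \frac{2mV_0a^2}{\pi\hbar^2} - \frac{\pi}{4} \tag{9.693}$$

Therefore,

$$|E| = \frac{\hbar^2\eta^2}{2ma^2} \approx \frac{\hbar^2\pi^2}{8ma^2}\left(\frac{2mV_0a^2}{\pi\hbar^2} - \frac{\pi}{4}\right)^2 \tag{9.694}$$

A typical value for the range of the interaction is of the order of 2 Fermi, or $a \approx 2\times10^{-13}$ cm. In order to get $|E| \approx 2.3$ MeV, we would need a well depth of $V_0 \approx 42$ MeV, which is reasonable (according to experimentalists).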


9.6.11. The Deuteron - Another Way 

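Experiment (scattering) indicates that instead of a square well (very unrealistic) the actual potential is of the form

$$V(r) = -Ae^{-\frac{r}{a}} \tag{9.695}$$

where

$$A \approx 32 \text{ MeV} \quad \text{and} \quad a \approx 2.25\times10^{-13} \text{ cm} \tag{9.696}$$

We can solve for this potential ($\ell = 0$ case) exactly using a clever trick I first learned from Hans Bethe at Cornell University.

The radial equation in this case is

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2m}{\hbar^2}\left(Ae^{-\frac{r}{a}} - |E|\right)R = 0 \tag{9.697}$$

$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \frac{2m}{\hbar^2}\left(Ae^{-\frac{r}{a}} - |E|\right)R = 0 \tag{9.698}$$

Now let

$$R(r) = \frac{\chi(r)}{r} \tag{9.699}$$

which implies that

$$\frac{dR}{dr} = \frac{1}{r}\frac{d\chi}{dr} - \frac{1}{r^2}\chi \tag{9.700}$$

$$\frac{d^2R}{dr^2} = -\frac{2}{r^2}\frac{d\chi}{dr} + \frac{1}{r}\frac{d^2\chi}{dr^2} + \frac{2}{r^3}\chi \tag{9.701}$$

Substitution gives

$$\frac{d^2\chi}{dr^2} + \frac{2m}{\hbar^2}\left(Ae^{-\frac{r}{a}} - |E|\right)\chi = 0 \tag{9.702}$$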

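We now change the variables using

$$\xi = e^{-\frac{r}{2a}} \ \to \ \frac{d}{dr} = \frac{d\xi}{dr}\frac{d}{d\xi} = -\frac{\xi}{2a}\frac{d}{d\xi}$$

$$\frac{d^2}{dr^2} = \frac{d}{dr}\left(-\frac{\xi}{2a}\frac{d}{d\xi}\right) = \frac{d\xi}{dr}\frac{d}{d\xi}\left(-\frac{\xi}{2a}\frac{d}{d\xi}\right) = \left(\frac{\xi}{2a}\right)^2\frac{d^2}{d\xi^2} + \frac{\xi}{4a^2}\frac{d}{d\xi}$$

We then get the equation

$$\xi^2\frac{d^2\chi}{d\xi^2} + \xi\frac{d\chi}{d\xi} + \left((\alpha a)^2\xi^2 - (ka)^2\right)\chi = 0 \tag{9.703}$$

where

$$(\alpha a)^2 = \frac{2mA}{\hbar^2}(2a)^2 \quad \text{and} \quad (ka)^2 = \frac{2m|E|}{\hbar^2}(2a)^2 \tag{9.704}$$

Now Bessel's equation has the form

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + \left(x^2 - \nu^2\right)y = 0 \tag{9.705}$$

Therefore, we have Bessel's equation with a general solution

$$\chi(r) = CJ_{ka}(\alpha a\xi) + BY_{ka}(\alpha a\xi) \tag{9.706}$$

$$R(r) = \frac{1}{r}\left(CJ_{ka}(\alpha a\xi) + BY_{ka}(\alpha a\xi)\right) \tag{9.707}$$

As $r \to \infty$, $\xi = e^{-\frac{r}{2a}} \to 0$, which implies we must choose $B = 0$ or the solution diverges.

As $r \to 0$, $\xi = e^{-\frac{r}{2a}} \to 1$, which implies we must have

$$J_{ka}(\alpha a) = 0 \tag{9.708}$$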

or the solution diverges. This is then the energy eigenvalue condition. For the 
values of A and a given earlier we have 

J ka { 6-28) = 0 (9.709) 

and if we let \E\ = qA, this becomes 

J 1/2 ( 6.28) = 0 (9.710) 

Now 

J 6 .4 9 (6.28) = 0 (9.711) 

Therefore we have 

6.4g = - ->■ q = — - \E\ = qA = 2.34 MeV (9.712) 

2 12.8 

which is an excellent result for the bound state energy of the deuteron. 


9.6.12. Linear Potential 

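We now consider a linear potential energy function given by

$$V(r) = \alpha r - V_0 \tag{9.713}$$

The Schrödinger equation becomes

$$-\frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \frac{L^2_{op}}{\hbar^2r^2}\right)\psi + (\alpha r - V_0 - E)\psi = 0 \tag{9.714}$$

Since it is a central potential, we can write

$$\psi = R(r)Y_{\ell m}(\theta,\varphi) \tag{9.715}$$

which implies

$$-\frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{\ell(\ell+1)}{r^2}R\right) + (\alpha r - V_0 - E)R = 0 \tag{9.716}$$

If we let

$$\chi(r) = rR(r) \tag{9.717}$$

we get

$$-\frac{\hbar^2}{2m}\frac{d^2\chi}{dr^2} + \left(\alpha r - V_0 - E + \frac{\hbar^2\ell(\ell+1)}{2mr^2}\right)\chi = 0 \tag{9.718}$$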
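For $\ell > 0$ there is no closed form solution for this equation and numerical methods must be applied.

For $\ell = 0$, we have

$$-\frac{\hbar^2}{2m}\frac{d^2\chi}{dr^2} + (\alpha r - V_0 - E)\chi = 0$$

Now we let

$$\xi = \left(r - \frac{E + V_0}{\alpha}\right)\left(\frac{2m\alpha}{\hbar^2}\right)^{1/3} \tag{9.719}$$

which implies that

$$\frac{d}{dr} = \frac{d\xi}{dr}\frac{d}{d\xi} = \left(\frac{2m\alpha}{\hbar^2}\right)^{1/3}\frac{d}{d\xi} \ , \quad \frac{d^2}{dr^2} = \frac{d}{dr}\left(\frac{d\xi}{dr}\frac{d}{d\xi}\right) = \left(\frac{2m\alpha}{\hbar^2}\right)^{2/3}\frac{d^2}{d\xi^2} \tag{9.720}$$

We then get the equation

$$\frac{d^2\chi}{d\xi^2} - \xi\chi = 0 \tag{9.721}$$

This equation is not as simple as it looks. Let us try a series solution of the form

$$\chi(\xi) = \sum_{n=0}^{\infty}a_n\xi^n \tag{9.722}$$

Substitution gives the relations

$$a_2 = 0 \quad \text{and} \quad a_{m+2} = \frac{a_{m-1}}{(m+2)(m+1)} \tag{9.723}$$

The solution that goes to zero as $\xi \to \pm\infty$ is then of the form

$$\chi(\xi) = c_1f(\xi) - c_2g(\xi) = C\,Ai(\xi) = \text{Airy function} \tag{9.724}$$

where

$$f(\xi) = 1 + \frac{1}{3!}\xi^3 + \frac{1\cdot4}{6!}\xi^6 + \frac{1\cdot4\cdot7}{9!}\xi^9 + \ldots \tag{9.725}$$

$$g(\xi) = \xi + \frac{2}{4!}\xi^4 + \frac{2\cdot5}{7!}\xi^7 + \frac{2\cdot5\cdot8}{10!}\xi^{10} + \ldots \tag{9.726}$$

Now to ensure that the wave function is normalizable we must also have

$$\chi(r = 0) = 0 \tag{9.727}$$

$$Ai\left(-\frac{E + V_0}{\alpha}\left(\frac{2m\alpha}{\hbar^2}\right)^{1/3}\right) = 0 \tag{9.728}$$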
This says that

$$-\frac{E_n + V_0}{\alpha}\left(\frac{2m\alpha}{\hbar^2}\right)^{1/3} = z_n = n^{th} \text{ zero of } Ai(\xi) \tag{9.729}$$

Thus, the allowed energies are given by

$$E_n = -z_n\left(\frac{\hbar^2\alpha^2}{2m}\right)^{1/3} - V_0 \tag{9.730}$$

The first three zeroes are

$$z_1 = -2.3381 \ , \quad z_2 = -4.0879 \ , \quad z_3 = -5.5206 \tag{9.731}$$

Programming the numerical method we described earlier for solving the Schrödinger equation allows us to determine $E$ values. For simplicity, we have chosen $\hbar = m = \alpha = V_0 = 1$, which gives the $\ell = 0$ energies as

$$E_n = -z_n\left(\frac{1}{2}\right)^{1/3} - 1 = -0.794\,z_n - 1 \tag{9.732}$$

The general $\ell$ equation is (in this case)

$$-\frac{1}{2}\frac{d^2\chi}{dr^2} + \left(r - 1 - E + \frac{\ell(\ell+1)}{2r^2}\right)\chi = 0 \tag{9.733}$$

For $\ell = 0$ we theoretically expect the first three energy values 0.856, 2.246 and 3.384, and the program produces the values 0.855750, 2.24460, 3.38160, which is good agreement.
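
The program referred to above is not reproduced in this section. As a stand-in, here is a minimal shooting-method sketch for the $\ell = 0$ case with $\hbar = m = \alpha = V_0 = 1$: it integrates $\chi'' = 2(r - 1 - E)\chi$ outward from $\chi(0) = 0$ and bisects on $E$ until $\chi$ vanishes at a large radius. The integrator, the cutoff radius `r_max`, the step count and the energy brackets are all our own assumptions and are not necessarily the method used in the text.

```python
def chi_at_rmax(E, r_max=8.0, n_steps=4000):
    """Integrate chi'' = 2 (r - 1 - E) chi with chi(0) = 0, chi'(0) = 1 using RK4.

    Returns chi(r_max); its sign flips each time E crosses an eigenvalue.
    """
    h = r_max / n_steps

    def deriv(r, chi, dchi):
        return dchi, 2.0 * (r - 1.0 - E) * chi

    r, chi, dchi = 0.0, 0.0, 1.0
    for _ in range(n_steps):
        k1c, k1d = deriv(r, chi, dchi)
        k2c, k2d = deriv(r + h / 2, chi + h / 2 * k1c, dchi + h / 2 * k1d)
        k3c, k3d = deriv(r + h / 2, chi + h / 2 * k2c, dchi + h / 2 * k2d)
        k4c, k4d = deriv(r + h, chi + h * k3c, dchi + h * k3d)
        chi += h / 6 * (k1c + 2 * k2c + 2 * k3c + k4c)
        dchi += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        r += h
    return chi

def find_energy(lo, hi, tol=1e-8):
    """Bisect on E inside a bracket [lo, hi] that contains exactly one eigenvalue."""
    f_lo = chi_at_rmax(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = chi_at_rmax(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Brackets chosen by hand around the values expected from the Airy zeros.
brackets = [(0.0, 1.5), (1.5, 3.0), (3.0, 4.0)]
energies = [find_energy(lo, hi) for lo, hi in brackets]
print(energies)
```

The same routine handles $\ell > 0$ if the term $\ell(\ell+1)/2r^2$ is added to the potential and the integration is started slightly away from $r = 0$.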


We now use the linear potential to look at quark-quark bound states at low 
energies. 


9.6.13. Modified Linear Potential and Quark Bound States 

Over the past three decades the quark model of elementary particles has had 
many successes. Some experimental results are the following: 

1. Free (isolated) quarks have never been observed. 

2. At small quark separations color charge exhibits behavior similar to that 
of ordinary charge. 

3. Quark-Antiquark pairs form bound states. 

We can explain some of the features of these experimental results by describing the quark-quark force by an effective potential of the form

$$V(r) = -\frac{A}{r} + Br \tag{9.734}$$

and using non-relativistic quantum mechanics.

The linear term corresponds to a long-range confining potential that is responsible for the fact that free, isolated quarks have not been observed.

In simple terms, a linear potential of this type means that it costs more and 
more energy as one of the quarks in a bound system attempts to separate from 
the other quark. When this extra energy is twice the rest energy of a quark, 
a new quark-antiquark pair can be produced. So instead of the quark getting 
free, one of the newly created quarks joins with one of the original quarks to 
recreate the bound pair (so it looks like nothing has happened) and the other 
new quark binds with the quark attempting to get free into a new meson. We 
never see a free quark! A lot of energy has been expended, but the outcome is 
the creation of a meson rather than the appearance of free quarks. 

The other term, which resembles a Coulomb potential, reflects the fact that at 
small separations the so-called "color charge" forces behave like ordinary charge 
forces. 

The original observations of quark-quark bound states were made in 1974. The experiments involved bound states of charmed quarks called charmonium (named after the similar bound states of electrons and positrons called positronium). The observed bound-state energy levels were as shown in Table 9.7 below:

n    ℓ    E(GeV)
1    0    3.097
2    1    3.492
2    0    3.686
3    0    4.105
4    0    4.414

Table 9.7: Observed Bound-State Energy Levels
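The Schrödinger equation for this potential becomes

$$-\frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \frac{L^2_{op}}{\hbar^2r^2}\right)\psi + (V(r) - E)\psi = 0 \tag{9.735}$$

Since it is a central potential, we can write

$$\psi = R(r)Y_{\ell m}(\theta,\varphi) \tag{9.736}$$

which implies

$$-\frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{\ell(\ell+1)}{r^2}R\right) + (V(r) - E)R = 0 \tag{9.737}$$

If we let

$$\chi(r) = rR(r) \tag{9.738}$$

we get

$$-\frac{\hbar^2}{2m}\frac{d^2\chi}{dr^2} + \left(V(r) - E + \frac{\hbar^2\ell(\ell+1)}{2mr^2}\right)\chi = 0 \tag{9.739}$$

or

$$\frac{d^2\chi}{dr^2} + \left(aE + \frac{b}{r} - cr - \frac{\ell(\ell+1)}{r^2}\right)\chi = 0 \tag{9.740}$$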

We must solve this system numerically. The same program as earlier works 
again with a modified potential function. 


The results for a set of parameters (a = 0.0385, b = 2.026, c = 34.65) chosen to 
get the right relationship between the levels are shown in Table 9.8 below: 


n    ℓ    E(calculated)    E(rescaled)
1    0    656              3.1
2    1    838              3.4
2    0    1160             3.6
3    0    1568             4.1
4    0    1916             4.4

Table 9.8: Quark Model-Numerical Results

which is a reasonably good result and indicates the validity of the model. The rescaled values are adjusted to correspond to Table 9.7. A more exact parameter search produces almost exact agreement.


9.7. Problems 


9.7.1. Position representation wave function 

A system is found in the state 





cos ip 


(a) What are the possible values of L z that measurement will give and with 
what probabilities? 

(b) Determine the expectation value of L x in this state. 


9.7.2. Operator identities 

Show that 

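(a) $[\vec{a}\cdot\vec{L}, \vec{b}\cdot\vec{L}] = i\hbar\,(\vec{a}\times\vec{b})\cdot\vec{L}$ holds under the assumption that $\vec{a}$ and $\vec{b}$ commute with each other and with $\vec{L}$.

(b) for any vector operator $\vec{V}(\vec{x},\vec{p})$ we have $[\vec{L}^2, \vec{V}] = 2i\hbar\,(\vec{V}\times\vec{L} - i\hbar\vec{V})$.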

9.7.3. More operator identities 

Prove the identities 

(a) $(\vec{\sigma}\cdot\vec{A})(\vec{\sigma}\cdot\vec{B}) = \vec{A}\cdot\vec{B} + i\vec{\sigma}\cdot(\vec{A}\times\vec{B})$

(b) e l <t>S-nlh^ e -i<t>S-nlh _ - ^ ■ a) + fix [fix a] cos (j> + [n x d] sin $ 

9.7.4. On a circle 

Consider a particle of mass $\mu$ constrained to move on a circle of radius $a$. Show that



Solve the eigenvalue/eigenvector problem of H and interpret the degeneracy. 
9.7.5. Rigid rotator

A rigid rotator is immersed in a uniform magnetic field $\vec{B} = B_0\hat{e}_z$ so that the Hamiltonian is

$$\hat{H} = \frac{\hat{L}^2}{2I} + \omega_0\hat{L}_z$$

where $\omega_0$ is a constant. If


<M| VKO)} 



<\> 


what is $\langle\theta,\phi\,|\,\psi(t)\rangle$? What is $\langle\hat{L}_x\rangle$ at time $t$?


9.7.6. A Wave Function 

A particle is described by the wave function 

$$\psi(\rho,\phi) = Ae^{-\rho^2/2\Delta}\cos^2\phi$$

Determine $P(L_z = 0)$, $P(L_z = 2\hbar)$ and $P(L_z = -2\hbar)$.


9.7.7. L = 1 System 

Consider the following operators on a 3-dimensional Hilbert space 


$$L_x = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \ , \quad L_y = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix} \ , \quad L_z = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

(a) What are the possible values one can obtain if $L_z$ is measured?

(b) Take the state in which $L_z = 1$. In this state, what are $\langle L_x\rangle$, $\langle L_x^2\rangle$ and $\Delta L_x = \sqrt{\langle L_x^2\rangle - \langle L_x\rangle^2}$?

(c) Find the normalized eigenstates and eigenvalues of $L_x$ in the $L_z$ basis.

(d) If the particle is in the state with $L_z = -1$ and $L_x$ is measured, what are the possible outcomes and their probabilities?

(e) Consider the state

$$|\psi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ 1 \end{pmatrix}$$

in the $L_z$ basis. If $L_z^2$ is measured and a result +1 is obtained, what is the state after the measurement? How probable was this result? If $L_z$ is measured, what are the outcomes and respective probabilities?


(f) A particle is in a state for which the probabilities are $P(L_z = 1) = 1/4$,
$P(L_z = 0) = 1/2$ and $P(L_z = -1) = 1/4$. Convince yourself that the most
general, normalized state with this property is

$$|\psi\rangle = \frac{e^{i\delta_1}}{2}\,|L_z = 1\rangle + \frac{e^{i\delta_2}}{\sqrt{2}}\,|L_z = 0\rangle + \frac{e^{i\delta_3}}{2}\,|L_z = -1\rangle$$

We know that if $|\psi\rangle$ is a normalized state then the state $e^{i\delta}|\psi\rangle$ is a physically
equivalent state. Does this mean that the factors $e^{i\delta_j}$ multiplying
the $L_z$ eigenstates are irrelevant? Calculate, for example, $P(L_x = 0)$.
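
The question in part (f) can also be explored numerically. The following is a minimal sketch, assuming NumPy is available and working in units where $\hbar = 1$; the function name `prob_lx_zero` is introduced here for illustration only and does not come from the text. It projects the state of part (f) onto the $L_x = 0$ eigenvector and lets you vary the three phases.

```python
import numpy as np

# L_x in the L_z basis (units of hbar), as given in the problem statement.
LX = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]], dtype=complex) / np.sqrt(2)


def prob_lx_zero(d1, d2, d3):
    """Probability of measuring L_x = 0 in the state of part (f).

    d1, d2, d3 are the phases multiplying the L_z = +1, 0, -1 components.
    """
    psi = np.array([np.exp(1j * d1) / 2,
                    np.exp(1j * d2) / np.sqrt(2),
                    np.exp(1j * d3) / 2])
    # Diagonalize the Hermitian matrix L_x and pick the eigenvector
    # whose eigenvalue is closest to zero.
    vals, vecs = np.linalg.eigh(LX)
    zero_vec = vecs[:, np.argmin(np.abs(vals))]
    # np.vdot conjugates its first argument, giving <L_x = 0 | psi>.
    return abs(np.vdot(zero_vec, psi)) ** 2


if __name__ == "__main__":
    for phases in [(0, 0, 0), (0, 0, np.pi), (0.3, 0.3, 0.3 + np.pi)]:
        print(phases, prob_lx_zero(*phases))
```

Comparing runs with the same phase differences but a different common offset is one way to separate the role of an overall phase from that of the relative phases.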

9.7.8. A Spin-3/2 Particle 

Consider a particle with spin angular momentum $j = 3/2$. There are four sublevels
with this value of $j$, but different eigenvalues of $j_z$: $|m = 3/2\rangle$, $|m = 1/2\rangle$, $|m = -1/2\rangle$
and $|m = -3/2\rangle$.

(a) Show that the raising operator in this 4-dimensional space is 

$$j_+ = \hbar\left(\sqrt{3}\,|3/2\rangle\langle 1/2| + 2\,|1/2\rangle\langle -1/2| + \sqrt{3}\,|-1/2\rangle\langle -3/2|\right)$$

where the states have been labeled by the j z quantum number. 

(b) What is the lowering operator j- ? 

(c) What are the matrix representations of $J_\pm$, $J_x$, $J_y$, $J_z$ and $J^2$ in the $j_z$
basis?

(d) Check that the state

$$|\psi\rangle = \left(\sqrt{3}\,|3/2\rangle + |1/2\rangle - |-1/2\rangle - \sqrt{3}\,|-3/2\rangle\right)$$

is an eigenstate of $J_x$ with eigenvalue $\hbar/2$.

(e) Find the eigenstate of $J_x$ with eigenvalue $3\hbar/2$.

(f) Suppose the particle describes the nucleus of an atom, which has a magnetic
moment described by the operator $\vec{\mu} = g_N\mu_N\vec{j}$, where $g_N$ is the
g-factor and $\mu_N$ is the so-called nuclear magneton. At time $t = 0$, the
system is prepared in the state given in (c). A magnetic field, pointing
in the $y$ direction of magnitude $B$, is suddenly turned on. What is the
evolution of $\langle j_z\rangle$ as a function of time if

$$H = -\vec{\mu}\cdot\vec{B} = -g_N\mu_N\hbar\,\vec{J}\cdot B\hat{y} = -g_N\mu_N\hbar B J_y$$

where $\mu_N = e\hbar/2Mc$ = nuclear magneton? You will need to use the identity
we derived earlier

$$e^{xA}Be^{-xA} = B + [A,B]\,x + [A,[A,B]]\,\frac{x^2}{2} + [A,[A,[A,B]]]\,\frac{x^3}{6} + \ldots$$

9.7.9. Arbitrary directions

Method #1 

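(a) Using the $|z+\rangle$ and $|z-\rangle$ states of a spin 1/2 particle as a basis, set up
and solve as a problem in matrix mechanics the eigenvalue/eigenvector
problem for $S_n = \vec{S}\cdot\hat{n}$ where the spin operator is

$$\vec{S} = S_x\hat{e}_x + S_y\hat{e}_y + S_z\hat{e}_z$$

and

$$\hat{n} = \sin\theta\cos\varphi\,\hat{e}_x + \sin\theta\sin\varphi\,\hat{e}_y + \cos\theta\,\hat{e}_z$$

(b) Show that the eigenstates may be written as

$$|\hat{n}+\rangle = \cos\frac{\theta}{2}\,|z+\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|z-\rangle$$

$$|\hat{n}-\rangle = \sin\frac{\theta}{2}\,|z+\rangle - e^{i\varphi}\cos\frac{\theta}{2}\,|z-\rangle$$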
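
A quick numerical check of Method #1 can be useful before attempting the algebra. The sketch below assumes NumPy, units where $\hbar = 1$, and the standard Pauli-matrix representation $\vec{S} = \vec{\sigma}/2$ in the $|z\pm\rangle$ basis; the helper names `spin_along`, `n_plus` and `n_minus` are hypothetical and chosen here only for illustration. It builds $S_n$ for a given direction and tests whether the two candidate states are eigenvectors with eigenvalues $\pm 1/2$.

```python
import numpy as np

# Spin-1/2 operators in the |z+>, |z-> basis, in units of hbar.
SX = np.array([[0, 1], [1, 0]], dtype=complex) / 2
SY = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
SZ = np.array([[1, 0], [0, -1]], dtype=complex) / 2


def spin_along(theta, phi):
    """Matrix of S . n for the unit vector n with polar angles (theta, phi)."""
    return (np.sin(theta) * np.cos(phi) * SX
            + np.sin(theta) * np.sin(phi) * SY
            + np.cos(theta) * SZ)


def n_plus(theta, phi):
    """Candidate eigenstate with eigenvalue +1/2."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])


def n_minus(theta, phi):
    """Candidate eigenstate with eigenvalue -1/2."""
    return np.array([np.sin(theta / 2), -np.exp(1j * phi) * np.cos(theta / 2)])


if __name__ == "__main__":
    theta, phi = 0.7, 1.9          # arbitrary test direction
    s_n = spin_along(theta, phi)
    print(np.allclose(s_n @ n_plus(theta, phi), 0.5 * n_plus(theta, phi)))
    print(np.allclose(s_n @ n_minus(theta, phi), -0.5 * n_minus(theta, phi)))
```

A check of this kind does not replace the derivation asked for in (a) and (b); it only confirms that a proposed answer is consistent.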

Method #2 

This part demonstrates another way to determine the eigenstates of $S_n = \vec{S}\cdot\hat{n}$.

The operator

$$R(\theta\hat{e}_y) = e^{-iS_y\theta/\hbar}$$

rotates spin states by an angle $\theta$ counterclockwise about the $y$-axis.

(a) Show that this rotation operator can be expressed in the form

$$R(\theta\hat{e}_y) = \cos\frac{\theta}{2} - \frac{2i}{\hbar}S_y\sin\frac{\theta}{2}$$

(b) Apply $R$ to the states $|z+\rangle$ and $|z-\rangle$ to obtain the state $|\hat{n}+\rangle$ with $\varphi = 0$,
that is, rotated by angle $\theta$ in the $x$-$z$ plane.

9.7.10. Spin state probabilities 

The $z$-component of the spin of an electron is measured and found to be $+\hbar/2$.

(a) If a subsequent measurement is made of the $x$-component of the spin,
what are the possible results?

(b) What are the probabilities of finding these various results?

(c) If the axis defining the measured spin direction makes an angle $\theta$ with
respect to the original $z$-axis, what are the probabilities of various possible
results?

(d) What is the expectation value of the spin measurement in (c)? 

9.7.11. A spin operator

Consider a system consisting of a spin 1/2 particle. 

(a) What are the eigenvalues and normalized eigenvectors of the operator 

$$Q = A s_y + B s_z$$

where $s_y$ and $s_z$ are spin angular momentum operators and $A$ and $B$ are
real constants.

(b) Assume that the system is in a state corresponding to the larger eigenvalue.
What is the probability that a measurement of $s_y$ will yield the value $+\hbar/2$?

9.7.12. Simultaneous Measurement 

A beam of particles is subject to a simultaneous measurement of the angular 
momentum observables $L^2$ and $L_z$. The measurement gives pairs of values

$$(\ell, m) = (0,0) \text{ and } (1,-1)$$

with probabilities 3/4 and 1/4 respectively.

(a) Reconstruct the state of the beam immediately before the measurements.

(b) The particles in the beam with $(\ell, m) = (1,-1)$ are separated out and
subjected to a measurement of $L_x$. What are the possible outcomes and
their probabilities? 

(c) Construct the spatial wave functions of the states that could arise from 
the second measurement. 

9.7.13. Vector Operator 

Consider a vector operator $\vec{V}$ that satisfies the commutation relation

$$[L_i, V_j] = i\hbar\,\varepsilon_{ijk}V_k$$

This is the definition of a vector operator.

(a) Prove that the operator $e^{-i\varphi L_x/\hbar}$ is a rotation operator corresponding to
a rotation around the $x$-axis by an angle $\varphi$, by showing that

$$e^{-i\varphi L_x/\hbar}\,V_i\,e^{i\varphi L_x/\hbar} = R_{ij}(\varphi)V_j$$

where $R_{ij}(\varphi)$ is the corresponding rotation matrix.

(b) Prove that

$$e^{-i\pi L_x/\hbar}\,|\ell, m\rangle = |\ell, -m\rangle$$

(c) Show that a rotation by $\pi$ around the $z$-axis can also be achieved by first
rotating around the $x$-axis by $\pi/2$, then rotating around the $y$-axis by $\pi$
and, finally rotating back by $-\pi/2$ around the $x$-axis. In terms of rotation
operators this is expressed by

$$e^{i\pi L_x/2\hbar}\,e^{-i\pi L_y/\hbar}\,e^{-i\pi L_x/2\hbar} = e^{-i\pi L_z/\hbar}$$

9.7.14. Addition of Angular Momentum

Two atoms with $J_1 = 1$ and $J_2 = 2$ are coupled, with an energy described by
$H = \epsilon\,\vec{J}_1\cdot\vec{J}_2$, $\epsilon > 0$. Determine all of the energies and degeneracies for the coupled
system.
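
One way to cross-check an analytic answer to this problem is brute-force diagonalization in the product basis. The sketch below is an illustration under stated assumptions: NumPy is available, $\hbar = 1$, and the function names `spin_matrices` and `coupled_levels` are hypothetical helpers introduced here, not part of the text. It builds $\vec{J}_1\cdot\vec{J}_2$ as a sum of Kronecker products and counts how often each eigenvalue occurs.

```python
from collections import Counter

import numpy as np


def spin_matrices(spin):
    """Return (Jx, Jy, Jz) for the given spin in the |spin, m> basis (hbar = 1).

    Basis order is m = spin, spin - 1, ..., -spin.
    """
    m = np.arange(spin, -spin - 1, -1)
    # <m+1| J+ |m> = sqrt(spin(spin+1) - m(m+1)) sits on the first superdiagonal.
    raising = np.diag(np.sqrt(spin * (spin + 1) - m[1:] * (m[1:] + 1)), k=1)
    jx = (raising + raising.T) / 2
    jy = (raising - raising.T) / 2j      # 2j is the imaginary literal 2i
    jz = np.diag(m).astype(complex)
    return jx, jy, jz


def coupled_levels(j1, j2, eps=1.0):
    """Map each eigenvalue of H = eps * J1 . J2 to its degeneracy."""
    ops1 = spin_matrices(j1)
    ops2 = spin_matrices(j2)
    # J1 . J2 on the product space is the sum over components of J1a (x) J2a.
    h = eps * sum(np.kron(a, b) for a, b in zip(ops1, ops2))
    energies = np.linalg.eigvalsh(h)
    return dict(sorted(Counter(float(e) for e in np.round(energies, 6)).items()))


if __name__ == "__main__":
    # Energies are in units of eps * hbar^2.
    print(coupled_levels(1, 2))
```

The degeneracies returned should add up to $(2J_1+1)(2J_2+1)$, which is a simple sanity check on any result.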


9.7.15. Spin = 1 system 

We now consider a spin = 1 system. 

(a) Use the spin = 1 states $|1,1\rangle$, $|1,0\rangle$ and $|1,-1\rangle$ (eigenstates of $S_z$) as a
basis to form the matrix representation ($3\times 3$) of the angular momentum
operators $S_x$, $S_y$, $S_z$, $S^2$, $S_+$, and $S_-$.

(b) Determine the eigenstates of $S_x$ in terms of the eigenstates $|1,1\rangle$, $|1,0\rangle$ and
$|1,-1\rangle$ of $S_z$.

(c) A spin = 1 particle is in the state 


$|\psi\rangle =$





in the S z basis. 

(1) What are the probabilities that a measurement of $S_z$ will yield the
values $\hbar$, 0, or $-\hbar$ for this state? What is $\langle S_z\rangle$?

(2) What is $\langle S_x\rangle$ in this state?

(3) What is the probability that a measurement of $S_x$ will yield the value
$\hbar$ for this state?


(d) A particle with spin = 1 has the Hamiltonian 

$$H = A S_z + \frac{B}{\hbar}S_x^2$$

(1) Calculate the energy levels of this system. 

(2) If, at $t = 0$, the system is in an eigenstate of $S_x$ with eigenvalue $\hbar$,
calculate the expectation value of the spin $\langle S_z\rangle$ at time $t$.


9.7.16. Deuterium Atom 

Consider a deuterium atom (composed of a nucleus of spin = 1 and an electron).
The electronic angular momentum is $\vec{J} = \vec{L} + \vec{S}$, where $\vec{L}$ is the orbital angular
momentum of the electron and $\vec{S}$ is its spin. The total angular momentum of
the atom is $\vec{F} = \vec{J} + \vec{I}$, where $\vec{I}$ is the nuclear spin. The eigenvalues of $J^2$ and
$F^2$ are $J(J+1)\hbar^2$ and $F(F+1)\hbar^2$ respectively.

(a) What are the possible values of the quantum numbers $J$ and $F$ for the
deuterium atom in the $1s$ ($L = 0$) ground state?

(b) What are the possible values of the quantum numbers $J$ and $F$ for a
deuterium atom in the $2p$ ($L = 1$) excited state?


9.7.17. Spherical Harmonics 

Consider a particle in a state described by 

$$\psi = N(x + y + 2z)e^{-\alpha r}$$


where N is a normalization factor. 

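(a) Show, by rewriting the $Y_1^{\pm 1,0}$ functions in terms of $x$, $y$, $z$ and $r$, that

$$Y_1^{\pm 1} = \mp\left(\frac{3}{4\pi}\right)^{1/2}\frac{x \pm iy}{\sqrt{2}\,r}, \qquad
Y_1^{0} = \left(\frac{3}{4\pi}\right)^{1/2}\frac{z}{r}$$

(b) Using this result, show that for a particle described by $\psi$ above

$$P(L_z = 0) = 2/3, \quad P(L_z = \hbar) = 1/6, \quad P(L_z = -\hbar) = 1/6$$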


9.7.18. Spin in Magnetic Field 


Suppose that we have a spin-1/2 particle interacting with a magnetic field via 
the Hamiltonian 


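$$H = \begin{cases} -\vec{\mu}\cdot\vec{B}, \quad \vec{B} = B\hat{e}_z & 0 < t < T \\ -\vec{\mu}\cdot\vec{B}, \quad \vec{B} = B\hat{e}_y & T < t < 2T \end{cases}$$

where $\vec{\mu} = \mu_B\vec{\sigma}$ and the system is initially ($t = 0$) in the state

$$|\psi(0)\rangle = |x+\rangle = \frac{1}{\sqrt{2}}\left(|z+\rangle + |z-\rangle\right)$$

Determine the probability that the state of the system at $t = 2T$ is

$$|\psi(2T)\rangle = |x+\rangle$$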


in three ways: 

(1) Using the Schrodinger equation (solving differential equations) 

(2) Using the time development operator (using operator algebra) 

(3) Using the density operator formalism 
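
The second approach (the time development operator) lends itself to a short numerical sketch, which can serve as a check on results obtained by the other two methods. The code below assumes NumPy, takes the moment to be $\vec{\mu} = \mu_B\vec{\sigma}$ so that each stage contributes a factor $e^{+i\alpha\sigma}$ with the dimensionless parameter $\alpha = \mu_B B T/\hbar$, and uses the fact that every Pauli matrix squares to the identity. The names `evolve` and `prob_x_plus` are hypothetical and introduced only for this illustration.

```python
import numpy as np

SIGMA_Y = np.array([[0, -1j], [1j, 0]])
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)
IDENTITY = np.eye(2, dtype=complex)


def evolve(sigma, alpha):
    """Return exp(+i * alpha * sigma) for a Pauli matrix, using sigma @ sigma = 1."""
    return np.cos(alpha) * IDENTITY + 1j * np.sin(alpha) * sigma


def prob_x_plus(alpha):
    """Probability of finding |x+> at t = 2T, with alpha = mu_B * B * T / hbar."""
    x_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
    # First stage: field along z for a time T; second stage: field along y.
    psi = evolve(SIGMA_Y, alpha) @ evolve(SIGMA_Z, alpha) @ x_plus
    return abs(np.vdot(x_plus, psi)) ** 2


if __name__ == "__main__":
    for alpha in (0.0, np.pi / 8, np.pi / 4, np.pi / 2):
        print(alpha, prob_x_plus(alpha))
```

Because the two stages do not commute, the order of the factors in `psi` matters: the operator for the earlier time interval acts first.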

9.7.19. What happens in the Stern-Gerlach box?

An atom with spin = 1/2 passes through a Stern-Gerlach apparatus adjusted so 
as to transmit atoms that have their spins in the +z direction. The atom spends 
time $T$ in a magnetic field $B$ in the $x$-direction.

(a) At the end of this time what is the probability that the atom would pass 
through a Stern-Gerlach selector for spins in the direction? 

(b) Can this probability be made equal to one, if so, how? 

9.7.20. Spin = 1 particle in a magnetic field 

[Use the results from Problem 9.7.15]. A particle with intrinsic spin = 1 is placed
in a uniform magnetic field $\vec{B} = B_0\hat{e}_x$. The initial spin state is $|\psi(0)\rangle = |1,1\rangle$.
Take the spin Hamiltonian to be $H = \omega_0 S_x$ and determine the probability that
the particle is in the state $|\psi(t)\rangle = |1,-1\rangle$ at time $t$.

9.7.21. Multiple magnetic fields 

A spin-1/2 system with magnetic moment $\vec{\mu} = \mu_0\vec{\sigma}$ is located in a uniform
time-independent magnetic field $B_0$ in the positive $z$-direction. For the time
interval $0 < t < T$ an additional uniform time-independent field $B_1$ is applied in
the positive $x$-direction. During this interval, the system is again in a uniform
constant magnetic field, but of different magnitude and direction $z'$ from the
initial one. At and before $t = 0$, the system is in the $m = 1/2$ state with respect
to the $z$-axis.

(a) At $t = 0+$, what are the amplitudes for finding the system with spin projections
$m' = \pm 1/2$ with respect to the $z'$-axis?

(b) What is the time development of the energy eigenstates with respect to
the $z'$ direction, during the time interval $0 < t < T$?

(c) What is the probability at $t = T$ of observing the system in the spin state
$m = -1/2$ along the original $z$-axis? [Express answers in terms of the angle
$\theta$ between the $z$ and $z'$ axes and the frequency $\omega_0 = \mu_0 B_0/\hbar$]

9.7.22. Neutron interferometer 

In a classic table-top experiment (neutron interferometer), a monochromatic
neutron beam ($\lambda = 1.445$ Å) is split by Bragg reflection at point A of an interferometer
into two beams which are then recombined (after another reflection) at
point D as in Figure 9.18 below:

One beam passes through a region of transverse magnetic field of strength $B$
(direction shown by lines) for a distance $L$. Assume that the two paths from A
to D are identical except for the region of magnetic field.

Figure 9.18: Neutron Interferometer Setup


(a) Find the explicit expression for the dependence of the intensity at point D 
on B, L and the neutron wavelength, with the neutron polarized parallel 
or anti-parallel to the magnetic field. 

(b) Show that the change in the magnetic field that produces two successive
maxima in the counting rates is given by

$$\Delta B = \frac{8\pi^2\hbar c}{|e|\,g_n\lambda L}$$

where $g_n$ (= -1.91) is the neutron magnetic moment in units of $-e\hbar/2m_n c$.
This calculation was a PRL publication in 1967.


9.7.23. Magnetic Resonance 

A particle of spin 1/2 and magnetic moment $\mu$ is placed in a magnetic field
$\vec{B} = B_0\hat{z} + B_1\hat{x}\cos\omega t - B_1\hat{y}\sin\omega t$, which is often employed in magnetic resonance
experiments. Assume that the particle has spin up along the $+z$-axis at $t = 0$
($m_z = +1/2$). Derive the probability to find the particle with spin down ($m_z = -1/2$)
at time $t > 0$.


9.7.24. More addition of angular momentum 

Consider a system of two particles with $j_1 = 2$ and $j_2 = 1$. Determine the
$|j, m, j_1, j_2\rangle$ states listed below in the $|j_1, m_1, j_2, m_2\rangle$ basis.

$$|3,3,j_1,j_2\rangle,\ |3,2,j_1,j_2\rangle,\ |3,1,j_1,j_2\rangle,\ |2,2,j_1,j_2\rangle,\ |2,1,j_1,j_2\rangle,\ |1,1,j_1,j_2\rangle$$


9.7.25. Clebsch-Gordan Coefficients 

Work out the Clebsch-Gordan coefficients for the combination 

$$\frac{3}{2} \otimes \frac{1}{2}$$


9.7.26. Spin-1/2 and Density Matrices

Let us consider the application of the density matrix formalism to the problem 
of a spin-1/2 particle in a static external magnetic field. In general, a particle 
with spin may carry a magnetic moment, oriented along the spin direction (by 
symmetry). For spin-1/2, we have that the magnetic moment (operator) is thus 
of the form: 

$$\mu_i = \frac{1}{2}\gamma\sigma_i$$

where the $\sigma_i$ are the Pauli matrices and $\gamma$ is a constant giving the strength of
the moment, called the gyromagnetic ratio. The term in the Hamiltonian for
such a magnetic moment in an external magnetic field $\vec{B}$ is just:

$$H = -\vec{\mu}\cdot\vec{B}$$

The spin-1/2 particle has a spin orientation or polarization given by

$$\vec{P} = \langle\vec{\sigma}\rangle$$


Let us investigate the motion of the polarization vector in the external field. 
Recall that the expectation value of an operator may be computed from the 
density matrix according to 

$$\langle A\rangle = \mathrm{Tr}(\rho A)$$

In addition the time evolution of the density matrix is given by

$$i\hbar\frac{d\rho}{dt} = [H, \rho]$$

Determine the time evolution dP/dt of the polarization vector. Do not make 
any assumption concerning the purity of the state. Discuss the physics involved 
in your results. 


9.7.27. System of N Spin-1/2 Particles

Let us consider a system of $N$ spin-1/2 particles per unit volume in thermal
equilibrium, in an external magnetic field $\vec{B}$. In thermal equilibrium the canonical
distribution applies and we have the density operator given by:

$$\rho = \frac{e^{-H/kT}}{Z}$$

where $Z$ is the partition function given by

$$Z = \mathrm{Tr}\left(e^{-H/kT}\right)$$


Such a system of particles will tend to orient along the magnetic field, resulting 
in a bulk magnetization (having units of magnetic moment per unit volume), 
M. 

(a) Give an expression for this magnetization $\vec{M} = N\gamma\langle\vec{\sigma}/2\rangle$ (don't work too
hard to evaluate).

(b) What is the magnetization in the high-temperature limit, to lowest nontrivial
order (this I want you to evaluate as completely as you can!)?

9.7.28. In a coulomb field 

An electron in the Coulomb field of the proton is in the state 

|0>=g|l,O,O) + ||2,l,l) 

where the $|n,\ell,m\rangle$ are the standard energy eigenstates of hydrogen.

(a) What is ( E) for this state? What are ( L 2 ), ( L x ) and (L x )l 

(b) What is $|\phi(t)\rangle$? Which, if any, of the expectation values in (a) vary with
time? 


9.7.29. Probabilities 

(a) Calculate the probability that an electron in the ground state of hydrogen
is outside the classically allowed region (defined by the classical turning
points).

(b) An electron is in the ground state of tritium, for which the nucleus is
the isotope of hydrogen with one proton and two neutrons. A nuclear
reaction instantaneously changes the nucleus into $^3$He, which consists of
two protons and one neutron. Calculate the probability that the electron
remains in the ground state of the new atom. Obtain a numerical answer.


9.7.30. What happens? 


At the time $t = 0$ the wave function for the hydrogen atom is

$$\psi(\vec{r},0) = \frac{1}{\sqrt{10}}\left(2\psi_{100} + \psi_{210} + \sqrt{2}\,\psi_{211} + \sqrt{3}\,\psi_{21-1}\right)$$

where the subscripts are the values of the quantum numbers ($n\ell m$). We ignore
spin and any radiative transitions.

(a) What is the expectation value of the energy in this state?

(b) What is the probability of finding the system with $\ell = 1$, $m = +1$ as a
function of time?

(c) What is the probability of finding an electron within $10^{-10}$ cm of the proton
(at time $t = 0$)? A good approximate result is acceptable.

(d) Suppose a measurement is made which shows that $L = 1$, $L_x = +1$. Determine
the wave function immediately after such a measurement.

9.7.31. Anisotropic Harmonic Oscillator

In three dimensions, consider a particle of mass $m$ and potential energy

$$V(\vec{r}) = \frac{m\omega^2}{2}\left[(1-\tau)(x^2+y^2) + (1+\tau)z^2\right]$$

where $\omega > 0$ and $0 < \tau < 1$.

(a) What are the eigenstates of the Hamiltonian and the corresponding eigenenergies?

(b) Calculate and discuss, as functions of $\tau$, the variation of the energy and
the degree of degeneracy of the ground state and the first two excited
states.


9.7.32. Exponential potential 

Two particles, each of mass M, are attracted to each other by a potential 


where $d = \hbar/mc$ with $mc^2 = 140$ MeV and $Mc^2 = 940$ MeV.


(a) Show that for $\ell = 0$ the radial Schrödinger equation for this system can be
reduced to Bessel's differential equation

$$\frac{d^2J_\rho(x)}{dx^2} + \frac{1}{x}\frac{dJ_\rho(x)}{dx} + \left(1 - \frac{\rho^2}{x^2}\right)J_\rho(x) = 0$$

by means of the change of variable $x = \alpha e^{-\beta r}$ for a suitable choice of $\alpha$
and $\beta$.

(b) Suppose that this system is found to have only one bound state with a
binding energy of 2.2 MeV. Evaluate $g^2/d$ numerically and state its units.

(c) What would the minimum value of $g^2/d$ have to be in order to have two
$\ell = 0$ bound states (keep $d$ and $M$ the same)? A possibly useful plot is given
below in Figure 9.19.

Figure 9.19: $J_\rho(\alpha)$ contours in the $\alpha$-$\rho$ plane


9.7.33. Bouncing electrons 

An electron moves above an impenetrable conducting surface. It is attracted 
toward this surface by its own image charge so that classically it bounces along 
the surface as shown in Figure 9.20 below: 

Figure 9.20: Bouncing electrons

(a) Write the Schrödinger equation for the energy eigenstates and the energy
eigenvalues of the electron. (Call $y$ the distance above the surface). Ignore
inertial effects of the image.

(b) What is the $x$ and $z$ dependence of the eigenstates?

(c) What are the remaining boundary conditions?

(d) Find the ground state and its energy. [HINT: they are closely related to
those for the usual hydrogen atom]

(e) What is the complete set of discrete and/or continuous energy eigenvalues? 

9.7.34. Alkali Atoms 

The alkali atoms have an electronic structure which resembles that of hydrogen. 
In particular, the spectral lines and chemical properties are largely determined 
by one electron (outside closed shells). A model for the potential in which this 
electron moves is 



Solve the Schrodinger equation and calculate the energy levels. 

9.7.35. Trapped between 

A particle of mass $m$ is constrained to move between two concentric impermeable
spheres of radii $r = a$ and $r = b$. There is no other potential. Find the ground
state energy and the normalized wave function.


9.7.36. Logarithmic potential 

A particle of mass $m$ moves in the logarithmic potential

$$V(r) = C\ln\left(\frac{r}{r_0}\right)$$

Show that:

(a) All the eigenstates have the same mean-squared velocity. Find this mean-squared
velocity. Think Virial theorem!

(b) The spacing between any two levels is independent of the mass $m$.

9.7.37. Spherical well

A spinless particle of mass $m$ is subject (in 3 dimensions) to a spherically symmetric
attractive square-well potential of radius $r_0$.

(a) What is the minimum depth of the potential needed to achieve two bound 
states of zero angular momentum? 

(b) With a potential of this depth, what are the eigenvalues of the Hamiltonian 
that belong to zero total angular momentum? Solve the transcendental 
equation where necessary. 

9.7.38. In magnetic and electric fields 

A point particle of mass $m$ and charge $q$ moves in spatially constant crossed
magnetic and electric fields $\vec{B} = B_0\hat{z}$ and $\vec{\mathcal{E}} = \mathcal{E}_0\hat{x}$.

(a) Solve for the complete energy spectrum.

(b) Find the expectation value of the velocity operator

$$\vec{v} = \frac{1}{m}\vec{p}_{\text{mechanical}}$$

in the state $\vec{p} = 0$.


9.7.39. Extra (Hidden) Dimensions

Lorentz Invariance with Extra Dimensions 

If string theory is correct, we must entertain the possibility that space-time has
more than four dimensions. The number of time dimensions must be kept equal
to one - it seems very difficult, if not altogether impossible, to construct a consistent
theory with more than one time dimension. The extra dimensions must
therefore be spatial.

Can we have Lorentz invariance in worlds with more than three spatial dimensions?
The answer is yes. Lorentz invariance is a concept that admits a very
natural generalization to space-times with additional dimensions.

We first extend the definition of the invariant interval $ds^2$ to incorporate the
additional space dimensions. In a world of five spatial dimensions, for example,
we would write

$$ds^2 = c^2dt^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 - (dx^4)^2 - (dx^5)^2 \qquad (9.741)$$

Lorentz transformations are then defined as the linear changes of coordinates
that leave $ds^2$ invariant. This ensures that every inertial observer in the six-dimensional
space-time will agree on the value of the speed of light. With more
dimensions come more Lorentz transformations. While in four-dimensional
space-time we have boosts in the $x^1$, $x^2$ and $x^3$ directions, in this new world
we have boosts along each of the five spatial dimensions. With three spatial
coordinates, there are three basic spatial rotations - rotations that mix $x^1$ and
$x^2$, rotations that mix $x^1$ and $x^3$, and finally rotations that mix $x^2$ and $x^3$.
The equality of the number of boosts and the number of rotations is a special
feature of four-dimensional space-time. With five spatial coordinates, we have
ten rotations, which is twice the number of boosts.
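
The counting in the paragraph above follows a simple pattern: one boost per spatial direction, and one rotation per pair of spatial directions. A minimal sketch of that count is given below; the function name `lorentz_generators` is a hypothetical label used only for this illustration.

```python
def lorentz_generators(spatial_dims):
    """Return (boosts, rotations) for a space-time with one time dimension.

    There is one boost for each spatial direction and one rotation for each
    unordered pair of distinct spatial directions.
    """
    boosts = spatial_dims
    rotations = spatial_dims * (spatial_dims - 1) // 2
    return boosts, rotations


if __name__ == "__main__":
    for d in (3, 5):
        print(d, lorentz_generators(d))
```

Solving $d = d(d-1)/2$ shows that the two counts agree only for $d = 3$ (apart from the trivial case $d = 0$), which is the special feature of four-dimensional space-time mentioned in the text.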

The higher-dimensional Lorentz invariance includes the lower-dimensional one.
If nothing happens along the extra dimensions, then the restrictions of lower-dimensional
Lorentz invariance apply. This is clear from equation (9.741). For
motion that does not involve the extra dimensions, $dx^4 = dx^5 = 0$, and the expression
for $ds^2$ reduces to that used in four dimensions.

Compact Extra Dimensions 

It is possible for additional spatial dimensions to be undetected by low energy
experiments if the dimensions are curled up into a compact space of small volume.
At this point let us first try to understand what a compact dimension is.
We will focus mainly on the case of one dimension. Later we will explain why
small compact dimensions are hard to detect.

Consider a one-dimensional world, an infinite line, say, and let $x$ be a coordinate
along this line. For each point $P$ along the line, there is a unique real number
$x(P)$ called the $x$-coordinate of the point $P$. A good coordinate on this infinite
line satisfies two conditions:

(1) Any two distinct points $P_1 \neq P_2$ have different coordinates $x(P_1) \neq x(P_2)$.

(2) The assignment of coordinates to points is continuous - nearby points
have nearly equal coordinates.

If a choice of origin is made for this infinite line, then we can use distance from 
the origin to define a good coordinate. The coordinate assigned to each point 
is the distance from that point to the origin, with sign depending upon which 
side of the origin the point lies. 

Imagine you live in a world with one spatial dimension. Suppose you are walking
along and notice a strange pattern - the scenery repeats each time you move a
distance $2\pi R$ for some value of $R$. If you meet your friend Phil, you see that
there are Phil clones at distances $2\pi R$, $4\pi R$, $6\pi R$, ... down the line as shown
in Figure 9.21 below.

Figure 9.21: Multiple friends

In fact, there are clones up the line, as well, with the same spacing. 

There is no way to distinguish an infinite line with such properties from a circle
with circumference $2\pi R$. Indeed, saying that this strange line is a circle explains
the peculiar property - there really are no Phil clones - you meet the same Phil
again and again as you go around the circle!

How do we express this mathematically? We can think of the circle as an open
line with an identification, that is, we declare that points with coordinates that
differ by $2\pi R$ are the same point. More precisely, two points are declared to be
the same point if their coordinates differ by an integer number of $2\pi R$:

$$P_1 \sim P_2 \leftrightarrow x(P_1) = x(P_2) + 2\pi R\,n, \quad n \in \mathbb{Z} \qquad (9.742)$$

This is precise, but somewhat cumbersome, notation. With no risk of confusion,
we can simply write

$$x \sim x + 2\pi R \qquad (9.743)$$

which should be read as identify any two points whose coordinates differ by $2\pi R$.
With such an identification, the open line becomes a circle. The identification
has turned a non-compact dimension into a compact one. It may seem to you
that a line with identifications is only a complicated way to think about a circle.
We will see, however, that many physical problems become clearer when we view
a compact dimension as an extended one with identifications.
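
The identification $x \sim x + 2\pi R$ has a direct computational counterpart in the modulo operation. The sketch below is only an illustration, with hypothetical function names `wrap` and `same_point` that do not appear in the text: it picks one representative coordinate in the interval $0 \le x < 2\pi R$ for every point of the line, and tests whether two coordinates label the same point of the circle.

```python
import math


def wrap(x, radius):
    """Return the representative of x in the interval 0 <= x < 2*pi*R."""
    return x % (2 * math.pi * radius)


def same_point(x1, x2, radius, tol=1e-9):
    """True if x1 and x2 differ by an integer multiple of 2*pi*R (within tol)."""
    circumference = 2 * math.pi * radius
    diff = (x1 - x2) % circumference
    # diff can land just below the circumference because of rounding,
    # so compare against both ends of the interval.
    return min(diff, circumference - diff) < tol


if __name__ == "__main__":
    R = 2.0
    phil = 0.5
    clone = phil + 3 * 2 * math.pi * R     # three trips around the circle
    print(wrap(clone, R), same_point(phil, clone, R))
```

In this picture every "Phil clone" on the infinite line is mapped by `wrap` to the same representative, which is the statement that there is only one Phil on the circle.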

The interval $0 \le x < 2\pi R$ is a fundamental domain for the identification (9.743) as
shown in Figure 9.22 below.

Figure 9.22: Fundamental domain

A fundamental domain is a subset of the entire space that satisfies two conditions:

(1) no two points in it are identified

(2) any point in the entire space is related by the identification to some point
in the fundamental domain

Whenever possible, as we did here, the fundamental domain is chosen to be a
connected region. To build the space implied by the identification, we take the
fundamental domain together with its boundary, and implement the identifications
on the boundary. In our case, the fundamental domain together with its
boundary is the segment $0 \le x \le 2\pi R$. In this segment we identify the point
$x = 0$ with the point $x = 2\pi R$. The result is the circle.

A circle of radius $R$ can be represented in a two-dimensional plane as the set of
points that are a distance $R$ from a point called the center of the circle. Note
that the circle obtained above has been constructed directly, without the help
of any two-dimensional space. For our circle, there is no point, anywhere, that
represents the center of the circle. We can still speak, figuratively, of the radius
$R$ of the circle, but in our case, the radius is simply the quantity which multiplied
by $2\pi$ gives the total length of the circle.

On the circle, the coordinate $x$ is no longer a good coordinate. The coordinate
$x$ is now either multi-valued or discontinuous. This is a problem with any coordinate
on a circle. Consider using angles to assign coordinates on the unit circle
as shown in Figure 9.23 below.

Fix a reference point $Q$ on the circle, and let $O$ denote the center of the
circle. To any point $P$ on the circle we assign as a coordinate the angle
$\theta(P) = \text{angle}(POQ)$. This angle is naturally multi-valued. The reference
point $Q$, for example, has $\theta(Q) = 0^\circ$ and $\theta(Q) = 360^\circ$. If we force angles to
be single-valued by restricting $0^\circ \le \theta < 360^\circ$, for example, then they become
discontinuous. Indeed, two nearby points, $Q$ and $Q^-$, then have very different
angles: $\theta(Q) = 0^\circ$, while $\theta(Q^-) \approx 360^\circ$. It is easier to work with multi-valued
coordinates than it is to work with discontinuous ones.

If we have a world with several open dimensions, then we can apply the identification
(9.743) to one of the dimensions, while doing nothing to the others. The
dimension described by $x$ turns into a circle, and the other dimensions remain
open. It is possible, of course, to make more than one dimension compact.

Consider, for example, the $(x, y)$ plane, subject to two identifications,

$$x \sim x + 2\pi R, \qquad y \sim y + 2\pi R$$

It is perhaps clearer to show both coordinates simultaneously while writing the
identifications. In that fashion, the two identifications are written as

$$(x, y) \sim (x + 2\pi R, y), \qquad (x, y) \sim (x, y + 2\pi R) \qquad (9.744)$$

The first identification implies that we can restrict our attention to $0 \le x < 2\pi R$,
and the second identification implies that we can restrict our attention to
$0 \le y < 2\pi R$. Thus, the fundamental domain can be taken to be the square
region $0 \le x, y < 2\pi R$ as shown in Figure 9.24 below.



Figure 9.24: Fundamental domain = square 

The identifications are indicated by the dashed lines and arrowheads. To build
the space implied by the identifications, we take the fundamental domain together
with its boundary, forming the full square $0 \le x, y \le 2\pi R$, and implement
the identifications on the boundary. The vertical edges are identified because
they correspond to points of the form $(0, y)$ and $(2\pi R, y)$, which are identified
by the first equation in (9.744). This results in the cylinder shown in Figure 9.25
below.


Figure 9.25: Square → cylinder


The horizontal edges are identified because they correspond to points of the
form $(x, 0)$ and $(x, 2\pi R)$, which are identified by the second equation in (9.744).
The resulting space is a two-dimensional torus.

We can visualize this process in Figure 9.26 below. 


Figure 9.26: 2-dimensional torus


In words, the torus is visualized by taking the fundamental domain (with its
boundary) and gluing the vertical edges as their identification demands. The
result is the first (vertical) cylinder shown above (the gluing seam is the dashed
line). In this cylinder, however, the bottom circle and the top circle must also
be glued, since they are nothing other than the horizontal edges of the fundamental
domain. To do this with paper, you must flatten the cylinder and then
roll it up and glue the circles. The result looks like a flattened doughnut. With a
flexible piece of garden hose, you could simply identify the two ends and obtain
the familiar picture of a torus.

We have seen how to compactify coordinates using identifications. Some compact
spaces are constructed in other ways. In string theory, however, compact
spaces that arise from identifications are particularly easy to work with.

Sometimes identifications have fixed points, points that are related to themselves
by the identification. For example, consider the real line parameterized by the
coordinate $x$ and subject to the identification $x \sim -x$. The point $x = 0$ is the
unique fixed point of the identification. A fundamental domain can be chosen
to be the half-line $x \ge 0$. Note that the boundary point $x = 0$ must be included
in the fundamental domain. The space obtained by the above identification is
in fact the fundamental domain $x \ge 0$. This is the simplest example of an orbifold,
a space obtained by identifications that have fixed points. This orbifold
is called an $\mathbb{R}^1/\mathbb{Z}_2$ orbifold. Here $\mathbb{R}^1$ stands for the (one-dimensional) real line,
and $\mathbb{Z}_2$ describes a basic property of the identification when it is viewed as the
transformation $x \to -x$ - if applied twice, it gives back the original coordinate.

Quantum Mechanics and the Square Well 


The fundamental relation governing quantum mechanics is 

$$[x_i, p_j] = i\hbar\,\delta_{ij}$$

In three spatial dimensions the indices $i$ and $j$ run from 1 to 3. The generalization
of quantum mechanics to higher dimensions is straightforward. With $d$
spatial dimensions, the indices simply run over the $d$ possible values.


To set the stage for the analysis of small extra dimensions, let us review the
standard quantum mechanics problem involving an infinite potential well.

The time-independent Schrödinger equation (in one-dimension) is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x)$$

In the infinite well system we have

$$V(x) = \begin{cases} 0 & \text{if } x \in (0,a) \\ \infty & \text{if } x \notin (0,a) \end{cases}$$

When $x \in (0,a)$, the Schrödinger equation becomes

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\psi(x)$$

The boundary conditions $\psi(0) = \psi(a) = 0$ give the solutions

$$\psi_k(x) = \sqrt{\frac{2}{a}}\sin\left(\frac{k\pi x}{a}\right), \qquad k = 1, 2, \ldots, \infty$$

The value $k = 0$ is not allowed since it would make the wave-function vanish
everywhere. The corresponding energy values are

$$E_k = \frac{\hbar^2}{2m}\left(\frac{k\pi}{a}\right)^2$$

Square Well with Extra Dimensions 

We now add an extra dimension to the square well problem. In addition to x, 
we include a dimension y that is curled up into a small circle of radius R. In 
other words, we make the identification 

$$(x, y) \sim (x, y + 2\pi R)$$

The original dimension $x$ has not been changed (see Figure 9.27 below). In the
figure, on the left we have the original square well potential in one dimension.
Here the particle lives on the line segment shown and on the right, in the
$(x, y)$ plane the particle must remain in $0 < x < a$. The direction $y$ is identified
as $y \sim y + 2\pi R$.



Figure 9.27: Square well with compact hidden dimension

The particle lives on a cylinder, that is, since the y direction has been turned into
a circle of circumference $2\pi R$, the space where the particle moves is a cylinder.
The cylinder has a length a and a circumference $2\pi R$. The potential energy
V(x, y) is given by

$$V(x,y) = \begin{cases} 0 & \text{if } x \in (0,a) \\ \infty & \text{if } x \notin (0,a) \end{cases}$$

that is, it is independent of y.


We want to investigate what happens when R is small and we only do experiments
at low energies. Now the only length scale in the one-dimensional infinite
well system is the size a of the segment, so small R means $R \ll a$.

(a) Write down the Schrodinger equation for two Cartesian dimensions. 

(b) Use separation of variables to find x-dependent and y-dependent solutions.

(c) Impose appropriate boundary conditions, namely, an infinite well in
the x dimension and a circle in the y dimension, to determine the allowed
values of parameters in the solutions.

(d) Determine the allowed energy eigenvalues and their degeneracy. 

(e) Show that the new energy levels contain the old energy levels plus
additional levels.

(f) Show that when $R \ll a$ (a very small (compact) hidden dimension) the
first new energy level appears at a very high energy. What are the
experimental consequences of this result?
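
As a numerical illustration of parts (d) through (f), the sketch below tabulates the levels one obtains if the separable solutions are taken to be $\sin(k\pi x/a)\,e^{i\ell y/R}$ with k = 1, 2, ... and $\ell$ = 0, ±1, ±2, ..., so that $E_{k,\ell} = \frac{\hbar^2}{2m}\left[(k\pi/a)^2 + (\ell/R)^2\right]$. That form is an assumption made here for illustration (it is what the problem asks you to derive), and the function name `well_levels`, the units $\hbar = m = 1$ and the sample values of a and R are hypothetical choices.

```python
import math


def well_levels(a, R, k_max=3, l_max=2, hbar=1.0, m=1.0):
    """Return sorted (energy, k, l) tuples for well width a and circle radius R."""
    levels = []
    for k in range(1, k_max + 1):             # k = 1, 2, ... from psi = 0 at x = 0, a
        for l in range(-l_max, l_max + 1):    # l = 0, +-1, ... from periodicity in y
            energy = hbar**2 / (2 * m) * ((k * math.pi / a) ** 2 + (l / R) ** 2)
            levels.append((energy, k, l))
    return sorted(levels)


a, R = 1.0, 0.01                               # a compact dimension with R << a
levels = well_levels(a, R)
old_ground = min(e for e, k, l in levels if l == 0)   # lowest one-dimensional level
first_new = min(e for e, k, l in levels if l != 0)    # lowest level with l != 0
ratio = first_new / old_ground                        # equals 1 + (a / (pi R))**2
print(f"first new level / old ground level = {ratio:.1f}")
```

The levels with $\ell = 0$ reproduce the one-dimensional spectrum, while the first level with $\ell \neq 0$ sits higher by a factor that grows like $(a/R)^2$, which is why low-energy experiments would not reveal the extra dimension.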


9.7.40. Spin-1/2 Particle in a D-State 

A particle of spin-1/2 is in a D-state of orbital angular momentum. What 
are its possible states of total angular momentum? Suppose the single particle 
Hamiltonian is 

$$H = A + B\,\vec{L}\cdot\vec{S} + C\,\vec{L}\cdot\vec{L}$$

What are the values of energy for each of the different states of total angular 
momentum in terms of the constants A, B, and C? 


9.7.41. Two Stern-Gerlach Boxes 

A beam of spin-1/2 particles traveling in the y-direction is sent through a
Stern-Gerlach apparatus, which is aligned in the z-direction, and which divides the
incident beam into two beams with m = ±1/2. The m = 1/2 beam is allowed to
impinge on a second Stern-Gerlach apparatus aligned along the direction given
by

$$\hat{e} = \sin\theta\,\hat{x} + \cos\theta\,\hat{z}$$

(a) Evaluate $S = (\hbar/2)\,\vec{\sigma}\cdot\hat{e}$, where $\vec{\sigma}$ is represented by the Pauli matrices:

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\,, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\,, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

Calculate the eigenvalues of S.

(b) Calculate the normalized eigenvectors of S. 

(c) Calculate the intensities of the two beams which emerge from the second 
Stern-Gerlach apparatus. 
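
A numerical cross-check of parts (a) through (c) can be sketched as follows. It builds $S = (\hbar/2)\,\vec{\sigma}\cdot\hat{e}$ for a sample angle, diagonalizes it, and projects the m = +1/2 state along z onto the eigenvectors. The function name `spin_along`, the choice $\hbar = 1$ and the sample angle are assumptions made for illustration only.

```python
import numpy as np


def spin_along(theta, hbar=1.0):
    """S = (hbar/2) sigma . e for e = sin(theta) x + cos(theta) z."""
    sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
    sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
    return 0.5 * hbar * (np.sin(theta) * sigma_x + np.cos(theta) * sigma_z)


theta = 0.7                                    # sample angle in radians
S = spin_along(theta)
eigvals, eigvecs = np.linalg.eigh(S)           # ascending order: -hbar/2, +hbar/2
up_z = np.array([1, 0], dtype=complex)         # m = +1/2 beam from the first device
# Squared overlaps with the eigenvectors: [P(-hbar/2), P(+hbar/2)]
intensities = np.abs(eigvecs.conj().T @ up_z) ** 2
print(eigvals, intensities)
```

For this setup the two relative intensities come out as $\sin^2(\theta/2)$ and $\cos^2(\theta/2)$, which is a useful target for the analytic calculation.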


9.7.42. A Triple-Slit experiment with Electrons 

A beam of spin-1/2 particles is sent into a triple-slit experiment according to
the figure below.



Figure 9.28: Triple-Slit Setup 

Calculate the resulting intensity pattern recorded at the detector screen. 

9.7.43. Cylindrical potential 

The Hamiltonian is given by

$$H = \frac{\vec{p}^{\,2}}{2\mu} + V(\rho)$$

where $\rho = \sqrt{x^2 + y^2}$.

(a) Use symmetry arguments to establish that both $p_z$ and $L_z$, the z-component
of the linear and angular momentum operators, respectively, commute
with H.


(b) Use the fact that H, $p_z$ and $L_z$ have eigenstates in common to express the
position space eigenfunctions of the Hamiltonian in terms of those of $p_z$
and $L_z$.


(c) What is the radial equation? Remember that the Laplacian in cylindrical
coordinates is

$$\nabla^2\psi = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\phi^2} + \frac{\partial^2\psi}{\partial z^2}$$

A particle of mass $\mu$ is in the cylindrical potential well

$$V(\rho) = \begin{cases} 0 & \rho < a \\ \infty & \rho > a \end{cases}$$

(d) Determine the three lowest energy eigenvalues for states that also have $p_z$
and $L_z$ equal to zero.

(e) Determine the three lowest energy eigenvalues for states that also have $p_z$
equal to zero. The states can have nonzero $L_z$.


9.7.44. Crazy potentials. 

(a) A nonrelativistic particle of mass m moves in the potential

$$V(x,y,z) = A(x^2 + y^2 + 2\lambda xy) + B(z^2 + 2\mu z)$$

where A > 0, B > 0, $|\lambda| < 1$, and $\mu$ is arbitrary. Find the energy eigenvalues.

(b) Now consider the following modified problem with a new potential

$$V_{new} = \begin{cases} V(x,y,z) & z > -\mu \text{ and any } x \text{ and } y \\ +\infty & z < -\mu \text{ and any } x \text{ and } y \end{cases}$$

Find the ground state energy.

9.7.45. Stern-Gerlach Experiment for a Spin-1 Particle 

A beam of spin-1 particles, moving along the y-axis, passes through a sequence
of two SG devices. The first device has its magnetic field along the z-axis and
the second device has its magnetic field along the z'-axis, which points in the
x - z plane at an angle $\theta$ relative to the z-axis. Both devices only transmit the
uppermost beam. What fraction of the particles entering the second device will
leave the second device?

9.7.46. Three Spherical Harmonics

As we see, often we need to calculate an integral of the form

$$\int d\Omega\; Y^*_{\ell_3 m_3}(\theta,\phi)\, Y_{\ell_2 m_2}(\theta,\phi)\, Y_{\ell_1 m_1}(\theta,\phi)$$

This can be interpreted as the matrix element $\langle \ell_3 m_3 |\, Y^{(\ell_2)}_{m_2} \,| \ell_1 m_1 \rangle$, where $Y^{(\ell_2)}_{m_2}$
is an irreducible tensor operator.

(a) Use the Wigner-Eckart theorem to determine the restrictions on the quantum
numbers so that the integral does not vanish.

(b) Given the addition rule for Legendre polynomials:

$$P_{\ell_1}(\mu)\, P_{\ell_2}(\mu) = \sum_{\ell_3} \langle \ell_3 0 \,|\, \ell_1 0\, \ell_2 0 \rangle^2\, P_{\ell_3}(\mu)$$

where $\langle \ell_3 0 \,|\, \ell_1 0\, \ell_2 0 \rangle$ is a Clebsch-Gordan coefficient. Use the Wigner-Eckart
theorem to prove

$$\int d\Omega\; Y^*_{\ell_3 m_3}(\theta,\phi)\, Y_{\ell_2 m_2}(\theta,\phi)\, Y_{\ell_1 m_1}(\theta,\phi) = \sqrt{\frac{(2\ell_2+1)(2\ell_1+1)}{4\pi(2\ell_3+1)}}\; \langle \ell_3 0 \,|\, \ell_1 0\, \ell_2 0 \rangle\, \langle \ell_3 m_3 \,|\, \ell_1 m_1\, \ell_2 m_2 \rangle$$

HINT: Consider $\langle \ell_3 0 |\, Y^{(\ell_2)}_0 \,| \ell_1 0 \rangle$.

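9.7.47. Spin operators à la Dirac

Show that

$$S_z = \frac{\hbar}{2}\,|z+\rangle\langle z+| - \frac{\hbar}{2}\,|z-\rangle\langle z-|$$

$$S_+ = \hbar\,|z+\rangle\langle z-|\,, \qquad S_- = \hbar\,|z-\rangle\langle z+|$$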

9.7.48. Another spin = 1 system 

A particle is known to have spin one. Measurements of the state of the particle 
yield $\langle S_x \rangle = 0 = \langle S_y \rangle$ and $\langle S_z \rangle = a$ where 0 < a < 1. What is the most general
possibility for the state?

9.7.49. Properties of an operator 

An operator f describing the interaction of two spin-1/2 particles has the form
$f = a + b\,\vec{\sigma}_1\cdot\vec{\sigma}_2$ where a and b are constants and
$\vec{\sigma}_j = \sigma_{xj}\hat{x} + \sigma_{yj}\hat{y} + \sigma_{zj}\hat{z}$ are Pauli
matrix operators. The total spin angular momentum is

$$\vec{j} = \vec{j}_1 + \vec{j}_2 = \frac{\hbar}{2}\left(\vec{\sigma}_1 + \vec{\sigma}_2\right)$$

(a) Show that f, $j^2$ and $j_z$ can be simultaneously measured.

(b) Derive the matrix representation of f in the $|j, m, j_1, j_2\rangle$ basis.

(c) Derive the matrix representation of f in the $|j_1, j_2, m_1, m_2\rangle$ basis.

9.7.50. Simple Tensor Operators/Operations 

Given the tensor operator form of the particle coordinate operators

$$\vec{r} = (x, y, z)\,; \qquad R^0_1 = z\,, \quad R^{\pm 1}_1 = \mp\frac{x \pm iy}{\sqrt{2}}$$

(the subscript "1" indicates it is a rank 1 tensor), and the analogously defined
particle momentum rank 1 tensor $P^q_1$, q = 0, ±1, calculate the commutator
between each of the components and show that the results can be written in the
form

$$[R^q_1, P^{q'}_1] = \text{simple expression}$$

9.7.51. Rotations and Tensor Operators 

Using the rank 1 tensor coordinate operator in Problem 9.7.50, calculate the 
commutators 

$$[L_\pm, R^q_1] \quad \text{and} \quad [L_z, R^q_1]$$

where L is the standard angular momentum operator. 

9.7.52. Spin Projection Operators 

Show that $P_1 = \frac{3}{4}I + (\vec{S}_1\cdot\vec{S}_2)/\hbar^2$ and $P_0 = \frac{1}{4}I - (\vec{S}_1\cdot\vec{S}_2)/\hbar^2$ project onto the
spin-1 and spin-0 spaces in $\frac{1}{2}\otimes\frac{1}{2} = 1 \oplus 0$. Start by giving a mathematical
statement of just what must be shown.
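
Before attempting the proof, the claimed properties can be checked numerically. The sketch below represents $\vec{S}_1\cdot\vec{S}_2/\hbar^2$ on the four-dimensional product space with Kronecker products and tests that the two operators are idempotent, mutually orthogonal, sum to the identity, and have traces 3 and 1. The variable names are hypothetical, and the check is an illustration, not a substitute for the requested derivation.

```python
import numpy as np

# Spin-1/2 operators divided by hbar
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

# (S1 . S2) / hbar^2 acting on the two-spin product space
s1_dot_s2 = sum(np.kron(s, s) for s in (sx, sy, sz))
identity = np.eye(4)

P1 = 0.75 * identity + s1_dot_s2      # candidate projector onto spin 1
P0 = 0.25 * identity - s1_dot_s2      # candidate projector onto spin 0

print(np.allclose(P1 @ P1, P1), np.allclose(P0 @ P0, P0))
print(np.trace(P1).real, np.trace(P0).real)
```

The traces count the dimensions of the two subspaces, three triplet states and one singlet state.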

9.7.53. Two Spins in a magnetic Field 

The Hamiltonian of a coupled spin system in a magnetic field is given by 

$$H = A + J\,\frac{\vec{S}_1\cdot\vec{S}_2}{\hbar^2} + B\,\frac{S_{1z} + S_{2z}}{\hbar}$$

where factors of $\hbar$ have been tossed in to make the constants A, J, B have
units of energy. [J is called the exchange constant and B is proportional to the
magnetic field].

(a) Find the eigenvalues and eigenstates of the system when one particle has 
spin 1 and the other has spin 1/2. 

(b) Give the ordering of levels in the low field limit $J \gg B$ and the high field
limit $B \gg J$ and interpret physically the result in each case.

9.7.54. Hydrogen d States

Consider the $\ell = 2$ states (for some given principal quantum number n, which is
irrelevant) of the H atom, taking into account the electron spin = 1/2 (neglect
nuclear spin!).

(a) Enumerate all states in the J, M representation arising from the $\ell = 2$,
s = 1/2 states.

(b) Two states have $m_j = M = +1/2$. Identify them and write them precisely
in terms of the product space kets $|\ell, m_\ell; s, m_s\rangle$ using the Clebsch-Gordan
coefficients.


9.7.55. The Rotation Operator for Spin-1/2 


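We learned that the operator

$$R_n(\Theta) = e^{-i\Theta(\vec{S}\cdot\hat{e}_n)/\hbar}$$

is a rotation operator, which rotates a vector about an axis $\hat{e}_n$ by an angle $\Theta$.
For the case of spin 1/2,

$$\vec{J} = \vec{S} = \frac{\hbar}{2}\vec{\sigma} \;\rightarrow\; R_n(\Theta) = e^{-i\Theta\sigma_n/2}$$

where $\sigma_n = \vec{\sigma}\cdot\hat{e}_n$.

(a) Show that for spin 1/2

$$R_n(\Theta) = \cos\left(\frac{\Theta}{2}\right) I - i\sin\left(\frac{\Theta}{2}\right)\sigma_n$$

(b) Show $R_n(\Theta = 2\pi) = -I$; Comment.

(c) Consider a series of rotations. Rotate about the y-axis by $\theta$ followed by
a rotation about the z-axis by $\phi$. Convince yourself that this takes the
unit vector along $\hat{e}_z$ to $\hat{e}_n$. Show that up to an overall phase

$$|\uparrow_n\rangle = R_z(\phi)\,R_y(\theta)\,|\uparrow_z\rangle$$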
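
The closed form in part (a) and the sign flip in part (b) can be checked numerically. The sketch below compares $\cos(\Theta/2)\,I - i\sin(\Theta/2)\,\sigma_n$ with the exponential $e^{-i\Theta\sigma_n/2}$ evaluated from the spectral decomposition of $\sigma_n$. The function names and the sample axis and angle are hypothetical choices for illustration.

```python
import numpy as np

PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)


def sigma_n(axis):
    """sigma . e_n for the unit vector along `axis`."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    return sum(c * s for c, s in zip(n, PAULI))


def rotation_closed_form(axis, angle):
    """cos(angle/2) I - i sin(angle/2) sigma_n."""
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * sigma_n(axis)


def rotation_exponential(axis, angle):
    """exp(-i angle sigma_n / 2) built from the eigen-decomposition of sigma_n."""
    vals, vecs = np.linalg.eigh(sigma_n(axis))
    return vecs @ np.diag(np.exp(-1j * angle * vals / 2)) @ vecs.conj().T


axis, angle = (1.0, 2.0, 2.0), 1.3
same = np.allclose(rotation_closed_form(axis, angle), rotation_exponential(axis, angle))
full_turn = rotation_closed_form(axis, 2 * np.pi)    # a rotation by 2 pi
print(same, np.allclose(full_turn, -np.eye(2)))
```

A rotation by $2\pi$ returns minus the identity, so a spinor needs a rotation by $4\pi$ to come back to itself.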


9.7.56. The Spin Singlet 

Consider the entangled state of two spins

$$|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_z\rangle_A \otimes |\downarrow_z\rangle_B - |\downarrow_z\rangle_A \otimes |\uparrow_z\rangle_B\right)$$

(a) Show that (up to a phase)

$$|\Psi_{AB}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_n\rangle_A \otimes |\downarrow_n\rangle_B - |\downarrow_n\rangle_A \otimes |\uparrow_n\rangle_B\right)$$

where $|\uparrow_n\rangle$, $|\downarrow_n\rangle$ are spin-up and spin-down states along the direction
$\hat{e}_n$. Interpret this result.

(b) Show that $\langle\Psi_{AB}|\,\sigma_n \otimes \sigma_{n'}\,|\Psi_{AB}\rangle = -\hat{e}_n\cdot\hat{e}_{n'}$

9.7.57. A One-Dimensional Hydrogen Atom

Consider the one-dimensional Hydrogen atom, such that the electron confined
to the x axis experiences an attractive force $e^2/r^2$.

(a) Write down Schrödinger's equation for the electron wavefunction $\psi(x)$ and
bring it to a convenient form by making the substitutions

$$a = \frac{\hbar^2}{me^2}\,, \qquad E = -\frac{\hbar^2}{2ma^2\alpha^2}\,, \qquad z = \frac{2x}{\alpha a}$$

(b) Solve the Schrödinger equation for $\psi(z)$. (You might need Mathematica,
symmetry arguments plus some properties of the Confluent Hypergeometric
functions or just remember earlier work).

(c) Find the three lowest allowed values of energy and the corresponding
bound state wavefunctions. Plot them for suitable parameter values.


9.7.58. Electron in Hydrogen p-orbital 

(a) Show that the solution of the Schrödinger equation for an electron in a
$p_z$-orbital of a hydrogen atom

$$\psi(r,\theta,\phi) = \sqrt{\frac{3}{4\pi}}\,R_{n\ell}(r)\cos\theta$$

is also an eigenfunction of the square of the angular momentum operator,
$L^2$, and find the corresponding eigenvalue. Use the fact that

$$L^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right]$$

Given that the general expression for the eigenvalue of $L^2$ is $\ell(\ell+1)\hbar^2$,
what is the value of the $\ell$ quantum number for this electron?

(b) In general, for an electron with this $\ell$ quantum number, what are the
allowed values of $m_\ell$? (NOTE: you should not restrict yourself to a $p_z$
electron here). What are the allowed values of s and $m_s$?

(c) Write down the 6 possible pairs of $m_s$ and $m_\ell$ values for a single electron
in a p-orbital. Given the Clebsch-Gordan coefficients shown in the
table below write down all allowed coupled states $|j, m_j\rangle$ in terms of the
uncoupled states $|m_\ell, m_s\rangle$. To get started here are the first three:

$$|3/2, 3/2\rangle = |1, 1/2\rangle$$

$$|3/2, 1/2\rangle = \sqrt{2/3}\,|0, 1/2\rangle + \sqrt{1/3}\,|1, -1/2\rangle$$

$$|1/2, 1/2\rangle = -\sqrt{1/3}\,|0, 1/2\rangle + \sqrt{2/3}\,|1, -1/2\rangle$$

                                           |j, m_j>
  m_j1   m_j2   |3/2,3/2>  |3/2,1/2>  |1/2,1/2>  |3/2,-1/2>  |1/2,-1/2>  |3/2,-3/2>
   1      1/2       1
   1     -1/2               √(1/3)     √(2/3)
   0      1/2               √(2/3)    -√(1/3)
   0     -1/2                                      √(2/3)      √(1/3)
  -1      1/2                                      √(1/3)     -√(2/3)
  -1     -1/2                                                               1

Table 9.9: Clebsch-Gordan coefficients for $j_1 = 1$ and $j_2 = 1/2$


(d) The spin-orbit coupling Hamiltonian, $H_{so}$, is given by

$$H_{so} = \xi(r)\,\vec{\ell}\cdot\vec{s}$$

Show that the states with $|j, m_j\rangle$ equal to $|3/2, 3/2\rangle$, $|3/2, 1/2\rangle$ and $|1/2, 1/2\rangle$
are eigenstates of the spin-orbit coupling Hamiltonian and find the
corresponding eigenvalues. Comment on which quantum numbers determine
the spin-orbit energy. (HINT: there is a rather quick and easy way to do
this, so if you are doing something long and tedious you might want to
think again.)

(e) The radial average of the spin-orbit Hamiltonian

$$\int_0^\infty \xi(r)\,|R_{n\ell}(r)|^2\,r^2\,dr$$

is called the spin-orbit coupling constant. It is important because it gives
the average interaction of an electron in some orbital with its own spin.
Given that for hydrogenic atoms

$$\xi(r) = \frac{Ze^2}{8\pi\varepsilon_0 m_e^2 c^2}\,\frac{1}{r^3}$$

and that for a 2p-orbital

$$R_{n\ell}(r) = \left(\frac{Z}{a_0}\right)^{3/2}\frac{1}{2\sqrt{6}}\,\rho\,e^{-\rho/2}$$

(where $\rho = Zr/a_0$ and $a_0 = 4\pi\varepsilon_0\hbar^2/m_e e^2$) derive an expression for the
spin-orbit coupling constant for an electron in a 2p-orbital. Comment on
the dependence on the atomic number Z.


(f) In the presence of a small magnetic field, B, the Hamiltonian changes by
a small perturbation given by

$$H^{(1)} = \mu_B B\,(\ell_z + 2s_z)$$

The change in energy due to a small perturbation is given in first-order
perturbation theory by

$$E^{(1)} = \langle 0|H^{(1)}|0\rangle$$

where $|0\rangle$ is the unperturbed state (i.e., in this example, the state in the
absence of the applied field). Use this expression to show that the change
in the energies of the states in part (d) is described by

$$E^{(1)} = \mu_B B\,g_j m_j \tag{9.745}$$

and find the values of $g_j$. We will prove the perturbation theory result in
Chapter 10.

(g) Sketch an energy level diagram as a function of applied magnetic field 
increasing from B = 0 for the case where the spin-orbit interaction is 
stronger than the electron’s interaction with the magnetic field. You can 
assume that the expressions you derived above for the energy changes of 
the three states you have been considering are applicable to the other 
states. 

9.7.59. Quadrupole Moment Operators 

The quadrupole moment operators can be written as

$$Q^{(+2)} = \sqrt{\frac{3}{8}}\,(x+iy)^2$$

$$Q^{(+1)} = -\sqrt{\frac{3}{2}}\,(x+iy)z$$

$$Q^{(0)} = \frac{1}{2}(3z^2 - r^2)$$

$$Q^{(-1)} = \sqrt{\frac{3}{2}}\,(x-iy)z$$

$$Q^{(-2)} = \sqrt{\frac{3}{8}}\,(x-iy)^2$$

Using the form of the wave function $\psi_{\ell,m} = R(r)\,Y^m_\ell(\theta,\phi)$,

(a) Calculate $\langle\psi_{3,3}|Q^{(0)}|\psi_{3,3}\rangle$

(b) Predict all others $\langle\psi_{3,m'}|Q^{(q)}|\psi_{3,m}\rangle$ using the Wigner-Eckart theorem in
terms of Clebsch-Gordan coefficients.

(c) Verify them with explicit calculations for $\langle\psi_{3,1}|Q^{(+1)}|\psi_{3,0}\rangle$, $\langle\psi_{3,-1}|Q^{(-2)}|\psi_{3,1}\rangle$
and $\langle\psi_{3,-2}|Q^{(0)}|\psi_{3,-3}\rangle$.

Note that we leave $\langle r^2\rangle = \int_0^\infty r^2\,dr\,R^2(r)\,r^2$ as an overall constant that drops out
from the ratios.

9.7.60. More Clebsch-Gordan Practice

Add angular momenta $j_1 = 3/2$ and $j_2 = 1$ and work out all the Clebsch-Gordan
coefficients starting from the state $|j, m\rangle = |5/2, 5/2\rangle = |3/2, 3/2\rangle \otimes |1, 1\rangle$.

9.7.61. Spherical Harmonics Properties 

(a) Show that $L_+$ annihilates $Y^2_2 = \sqrt{15/32\pi}\,\sin^2\theta\,e^{2i\phi}$.

(b) Work out all of $Y^m_2$ using successive applications of $L_-$ on $Y^2_2$.

(c) Plot the shapes of $Y^m_2$ in 3 dimensions $(r,\theta,\phi)$ using $r = |Y^m_2(\theta,\phi)|$.

9.7.62. Starting Point for Shell Model of Nuclei 

Consider a three-dimensional isotropic harmonic oscillator with Hamiltonian

$$H = \frac{\vec{p}^{\,2}}{2m} + \frac{1}{2}m\omega^2\vec{r}^{\,2} = \hbar\omega\left(\vec{a}^{\,+}\cdot\vec{a} + \frac{3}{2}\right)$$

where $\vec{p} = (p_1, p_2, p_3)$, $\vec{r} = (x_1, x_2, x_3)$, $\vec{a} = (a_1, a_2, a_3)$. We also have the
commutators $[x_i, p_j] = i\hbar\delta_{ij}$, $[x_i, x_j] = 0$, $[p_i, p_j] = 0$, $[a_i, a_j] = 0$, $[a^+_i, a^+_j] = 0$, and
$[a_i, a^+_j] = \delta_{ij}$. Answer the following questions.

(a) Clearly, the system is spherically symmetric, and hence there is a conserved
angular momentum vector. Show that $\vec{L} = \vec{r}\times\vec{p}$ commutes with the
Hamiltonian.

(b) Rewrite $\vec{L}$ in terms of creation and annihilation operators.

(c) Show that $|0\rangle$ belongs to the $\ell = 0$ representation. It is called the 1S state.

(d) Show that the operators $\mp\frac{1}{\sqrt{2}}(a^+_1 \pm ia^+_2)$ and $a^+_3$ form spherical tensor operators.

(e) Show that N = 1 states, $|1,1,\pm 1\rangle = \mp\frac{1}{\sqrt{2}}(a^+_1 \pm ia^+_2)|0\rangle$ and $|1,1,0\rangle = a^+_3|0\rangle$,
form the $\ell = 1$ representation. (Notation is $|N,\ell,m\rangle$.) It is called a 1P state
because it is the first P-state.

(f) Calculate the expectation values of the quadrupole moment $Q = (3z^2 - r^2)$
for $N = \ell = 1$, m = -1, 0, 1 states, and verify the Wigner-Eckart theorem.

(g) There are six possible states at the N = 2 level. Construct the states
$|2,\ell,m\rangle$ with definite $\ell = 0, 2$ and m. They are called 2S (because it is the
second S-state) and 1D (because it is the first D-state).

(h) How many possible states are there at the N = 3, 4 levels? What $\ell$
representations do they fall into?

(i) What can you say about general N?

(j) Verify that the operator $\Pi = e^{i\pi\vec{a}^{\,+}\cdot\vec{a}}$ has the correct property as the parity
operator by showing that $\Pi\vec{x}\,\Pi^+ = -\vec{x}$ and $\Pi\vec{p}\,\Pi^+ = -\vec{p}$.

(k) Show that $\Pi = (-1)^N$.

(l) Without calculating it explicitly, show that there are no dipole transitions
from the 2P state to the 1P state. As we will see in Chapter 11, this
means, show that $\langle 1P|\,\vec{r}\,|2P\rangle = 0$.

9.7.63. The Axial-Symmetric Rotor 

Consider an axially symmetric object which can rotate about any of its axes but
is otherwise rigid and fixed. We take the axis of symmetry to be the z-axis, as
shown below.



Figure 9.29: Axially Symmetric Rotor

The Hamiltonian for this system is

$$H = \frac{L_x^2 + L_y^2}{2I_\perp} + \frac{L_z^2}{2I_\parallel}$$

where $I_\perp$ and $I_\parallel$ are the moments of inertia about the principal axes.

(a) Show that the energy eigenvalues and eigenfunctions are respectively

$$E_{\ell,m} = \frac{\hbar^2}{2I_\perp}\left(\ell(\ell+1) - m^2\left(1 - \frac{I_\perp}{I_\parallel}\right)\right)\,, \qquad \psi_{\ell,m} = Y^m_\ell(\theta,\phi)$$

What are the possible values for $\ell$ and m? What are the degeneracies?

At t = 0, the system is prepared in the state

$$\psi_{\ell,m}(t=0) = \sqrt{\frac{3}{4\pi}}\,\frac{x}{r} = \sqrt{\frac{3}{4\pi}}\,\sin\theta\cos\phi$$

(b) Show that the state is normalized.

(c) Show that

$$\psi_{\ell,m}(t=0) = \frac{1}{\sqrt{2}}\left(-Y^1_1(\theta,\phi) + Y^{-1}_1(\theta,\phi)\right)$$

(d) From (c) we see that the initial state is NOT a single spherical harmonic
(the eigenfunctions given in part (a)). Nonetheless, show that the wavefunction
is an eigenstate of H (and thus a stationary state) and find the
energy eigenvalue. Explain this.

(e) If one were to measure the observable $L^2$ (magnitude of the angular
momentum squared) and $L_z$, what values could one find and with what
probabilities?


9.7.64. Charged Particle in 2-Dimensions 


Consider a charged particle on the x - y plane in a constant magnetic field
$\vec{B} = (0, 0, B)$ with the Hamiltonian (assume eB > 0)

$$H = \frac{\Pi_x^2 + \Pi_y^2}{2m}\,, \qquad \Pi_i = p_i - \frac{e}{c}A_i$$

(a) Use the so-called symmetric gauge $\vec{A} = B(-y, x)/2$, and simplify the Hamiltonian
using two annihilation operators $a_x$ and $a_y$ for a suitable choice of $\omega$.

(b) Further define $a_z = (a_x + ia_y)/\sqrt{2}$ and $a_{\bar{z}} = (a_x - ia_y)/\sqrt{2}$ and then rewrite the
Hamiltonian using them. General states are given in the form

$$|n, m\rangle = \frac{(a^+_z)^n}{\sqrt{n!}}\,\frac{(a^+_{\bar{z}})^m}{\sqrt{m!}}\,|0, 0\rangle$$

starting from the ground state where $a_z|0,0\rangle = a_{\bar{z}}|0,0\rangle = 0$. Show that
they are Hamiltonian eigenstates of energies $\hbar\omega(2n+1)$.

(c) For an electron, what is the excitation energy when B = 100 kG?

(d) Work out the wave function $\langle x, y|0, 0\rangle$ in position space.

(e) $|0, m\rangle$ are all ground states. Show that their position-space wave functions
are given by

$$\psi_{0,m}(z, \bar{z}) = N z^m e^{-eB\bar{z}z/4\hbar c}$$

where $z = x + iy$ and $\bar{z} = x - iy$. Determine N.

(f) Plot the probability density of the wave function for m = 0, 3, 10 on the
same scale (use ContourPlot or Plot3D in Mathematica).

(g) Assuming that the system is a circle of finite radius R, show that there are
only a finite number of ground states. Work out the number approximately
for large R.

(h) Show that the coherent state $e^{fa^+_z}|0,0\rangle$ represents a near-classical
cyclotron motion in position space.

9.7.65. Particle on a Circle Again

A particle of mass m is allowed to move only along a circle of radius R on a
plane, $x = R\cos\theta$, $y = R\sin\theta$.

(a) Show that the Lagrangian is $L = mR^2\dot{\theta}^2/2$ and write down the canonical
momentum $p_\theta$ and the Hamiltonian.

(b) Write down the Heisenberg equations of motion and solve them. (So far
no representation was taken.)

(c) Write down the normalized position-space wave function $\psi_k(\theta) = \langle\theta|k\rangle$
for the momentum eigenstates $p_\theta|k\rangle = \hbar k|k\rangle$ and show that only $k = n \in \mathbb{Z}$
are allowed because of the requirement $\psi(\theta + 2\pi) = \psi(\theta)$.

(d) Show the orthonormality

$$\int_0^{2\pi}\psi^*_n\,\psi_m\,d\theta = \delta_{nm}$$

(e) Now we introduce a constant magnetic field B inside the radius r < d < R
but no magnetic field outside r > d. Prove that the vector potential is

$$(A_x, A_y) = \begin{cases} B(-y, x)/2 & r < d \\ Bd^2(-y, x)/2r^2 & r > d \end{cases} \tag{9.746}$$

Write the Lagrangian, derive the Hamiltonian and show that the energy
eigenvalues are influenced by the magnetic field even though the particle
does not see the magnetic field directly.


9.7.66. Density Operators Redux 

(a) Find a valid density operator $\rho$ for a spin-1/2 system such that

$$\langle S_x\rangle = \langle S_y\rangle = \langle S_z\rangle = 0$$

Remember that for a state represented by a density operator $\rho$ we have
$\langle O_q\rangle = Tr[\rho\,O_q]$. Your density operator should be a 2 x 2 matrix with trace
equal to one and eigenvalues $0 < \lambda < 1$. Prove that the $\rho$ you find does not
correspond to a pure state and therefore cannot be represented by a state
vector.

(b) Suppose that we perform a measurement of the projection operator $P_i$ and
obtain a positive result. The projection postulate (reduction postulate)
for pure states says

$$|\Psi\rangle \rightarrow |\Psi_i\rangle = \frac{P_i|\Psi\rangle}{\sqrt{\langle\Psi|P_i|\Psi\rangle}}$$

Use this result to show that in density operator notation $\rho = |\Psi\rangle\langle\Psi|$ maps
to

$$\rho_i = \frac{P_i\,\rho\,P_i}{Tr[\rho\,P_i]}$$

9.7.67. Angular Momentum Redux

(a) Define the angular momentum operators $L_x$, $L_y$, $L_z$ in terms of the position
and momentum operators. Prove the following commutation result
for these operators: $[L_x, L_y] = i\hbar L_z$.

(b) Show that the operators $L_\pm = L_x \pm iL_y$ act as raising and lowering operators
for the z component of angular momentum by first calculating the
commutator $[L_z, L_\pm]$.

(c) A system is in state $\psi$, which is an eigenstate of the operators $L^2$ and $L_z$
with quantum numbers $\ell$ and m. Calculate the expectation values $\langle L_x\rangle$
and $\langle L_x^2\rangle$. HINT: express $L_x$ in terms of $L_\pm$.

(d) Hence show that $L_x$ and $L_y$ satisfy a general form of the uncertainty
principle:

$$\langle(\Delta A)^2\rangle\,\langle(\Delta B)^2\rangle \geq -\frac{1}{4}\langle[A,B]\rangle^2$$


9.7.68. Wave Function Normalizability 


The time-independent Schrödinger equation for a spherically symmetric potential
V(r) is

$$-\frac{\hbar^2}{2\mu}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial R}{\partial r}\right) - \frac{\ell(\ell+1)}{r^2}R\right] = (E - V)R$$

where $\psi = R(r)\,Y^m_\ell(\theta,\phi)$, so that the particle is in an eigenstate of angular
momentum.


Suppose $R(r) \propto r^{-\alpha}$ and $V(r) \propto -r^{-\beta}$ near the origin. Show that $\alpha < 3/2$
is required if the wavefunction is to be normalizable, but that $\alpha < 1/2$ (or
$\alpha < (3-\beta)/2$ if $\beta > 2$) is required for the expectation value of energy to be finite.


9.7.69. Currents 

The quantum flux density of probability is

$$\vec{j} = \frac{i\hbar}{2m}\left(\psi\nabla\psi^* - \psi^*\nabla\psi\right)$$

It is related to the probability density $\rho = |\psi|^2$ by $\nabla\cdot\vec{j} + \dot{\rho} = 0$.

(a) Consider the case where $\psi$ is a stationary state. Show that $\rho$ and $\vec{j}$ are
then independent of time. Show that, in one spatial dimension, $\vec{j}$ is also
independent of position.

(b) Consider a 3D plane wave $\psi = Ae^{i\vec{k}\cdot\vec{x}}$. What is $\vec{j}$ in this case? Give a
physical interpretation.

9.7.70. Pauli Matrices and the Bloch Vector

(a) Show that the Pauli operators

$$\sigma_x = \frac{2}{\hbar}S_x\,, \qquad \sigma_y = \frac{2}{\hbar}S_y\,, \qquad \sigma_z = \frac{2}{\hbar}S_z$$

satisfy

$$Tr[\sigma_i\sigma_j] = 2\delta_{ij}$$

where the indices i and j can take on the values x, y or z. You will
probably want to work with matrix representations of the operators.


(b) Show that the Bloch vector for a spin-1/2 degree of freedom

$$\vec{s} = \langle S_x\rangle\hat{x} + \langle S_y\rangle\hat{y} + \langle S_z\rangle\hat{z}$$

has length $\hbar/2$ if and only if the corresponding density operator represents
a pure state. You may wish to make use of the fact that an arbitrary
spin-1/2 density operator can be parameterized in the following way:

$$\rho = \frac{1}{2}\left(I + \langle\sigma_x\rangle\sigma_x + \langle\sigma_y\rangle\sigma_y + \langle\sigma_z\rangle\sigma_z\right)$$

Chapter 10

Time-Independent Perturbation Theory 


10.1. Nondegenerate Case 

For most physically interesting systems, it is not possible to find simple, exact 
formulas for the energy eigenvalues and state vectors. 

In many cases, however, the real system is very similar to another system that 
we can solve exactly in closed form. 

Our procedure will then be to approximate the real system by the similar system 
and approximately calculate corrections to find the corresponding values for 
the real system. The approximation method that is most often used is called 
perturbation theory. 


10.1.1. Rayleigh-Schrodinger Perturbation Theory 

Consider the problem of finding the energies (eigenvalues) and state vectors 
(eigenvectors) for a system with a Hamiltonian H that can be written in the 
form 

$$H = H_0 + V \tag{10.1}$$

where we have already solved the system described by $H_0$, i.e., we know that

$$H_0|n\rangle = \varepsilon_n|n\rangle \tag{10.2}$$

with $\langle m|n\rangle = \delta_{mn}$ (remember that the eigenvectors of a Hermitian operator
always form a complete orthonormal set ... or we can make them so using the
Gram-Schmidt process if degeneracy exists).

We call this solvable system the unperturbed or zero-order system.

We then assume that the extra term V is a small correction to $H_0$ (that is what
we mean by similar systems).

This says that the real physical system will have a solution given by

$$H|N\rangle = E_n|N\rangle \tag{10.3}$$

where the real physical state vectors $|N\rangle$ are only slightly different from the
unperturbed state vectors $|n\rangle$ and the real physical energies $E_n$ are only slightly
different from the unperturbed energies $\varepsilon_n$. Mathematically, we can express
this situation by writing the perturbation in the form

$$V = gU \tag{10.4}$$

where g is some small ($\ll 1$) constant factor pulled out of the correction term
V that characterizes the strength of the perturbation (its effect on the system
described by $H_0$).

As $g \to 0$, each eigenvector $|N\rangle$ of H must approach the corresponding
eigenvector $|n\rangle$ of $H_0$ and each energy eigenvalue $E_n$ of H must approach the
corresponding energy eigenvalue $\varepsilon_n$ of $H_0$.

We can guarantee that this property is true by assuming that power series ex¬ 
pansions in the small parameter g exist for all physically relevant quantities of 
the real system, i.e., 


$$H = H_0 + V = H_0 + gU \tag{10.5}$$

$$|N\rangle = |n\rangle + g|N^{(1)}\rangle + g^2|N^{(2)}\rangle + \ldots \tag{10.6}$$

$$E_n = \varepsilon_n + gE_n^{(1)} + g^2E_n^{(2)} + \ldots \tag{10.7}$$

where the terms $|N^{(i)}\rangle$ and $E_n^{(i)}$ are called the $i^{th}$-order corrections to the
unperturbed or zero-order solution. This is a major assumption that we cannot,
in general, prove is true a priori, i.e., we cannot prove that the power series
converge and therefore make sense.

The usual normalization condition we might impose would be (N\N) = 1. Since 
the results of any calculation are independent of the choice of normalization (re¬ 
member the expectation value and density operator definitions all include the 
norm in the denominator), we choose instead to use the normalization condition 

$$\langle n|N\rangle = 1 \tag{10.8}$$

which will greatly simplify our derivations and subsequent calculations. 

Substituting the power series expansion into the normalization condition we get

$$\langle n|N\rangle = 1 = \langle n|n\rangle + g\langle n|N^{(1)}\rangle + g^2\langle n|N^{(2)}\rangle + \ldots \tag{10.9}$$

But since we already have assumed that $\langle n|n\rangle = 1$, we must have

$$0 = g\langle n|N^{(1)}\rangle + g^2\langle n|N^{(2)}\rangle + \ldots \tag{10.10}$$

Now the only way for a power series to be identically zero is for the coefficient
of each power of g to be separately equal to zero. This gives the result

$$\langle n|N^{(i)}\rangle = 0\,, \qquad i = 1, 2, 3, 4, \ldots \tag{10.11}$$

as a direct consequence of the normalization condition, i.e., all corrections to
the state vector are orthogonal to the unperturbed state vector.

We now substitute all of these power series into the original energy eigenvalue 
equation for H: 

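$$H|N\rangle = E_n|N\rangle$$

$$(H_0 + gU)\left(|n\rangle + g|N^{(1)}\rangle + g^2|N^{(2)}\rangle + \ldots\right) = \left(\varepsilon_n + gE_n^{(1)} + g^2E_n^{(2)} + \ldots\right)\left(|n\rangle + g|N^{(1)}\rangle + g^2|N^{(2)}\rangle + \ldots\right)$$

We now multiply everything out and collect terms in a single power series in g.
We get

$$\begin{aligned}
0 = {} & \left(H_0|n\rangle - \varepsilon_n|n\rangle\right)g^0 + \left(H_0|N^{(1)}\rangle + U|n\rangle - \varepsilon_n|N^{(1)}\rangle - E_n^{(1)}|n\rangle\right)g^1 + \ldots \\
& + \left(H_0|N^{(k)}\rangle + U|N^{(k-1)}\rangle - \varepsilon_n|N^{(k)}\rangle - E_n^{(1)}|N^{(k-1)}\rangle - \ldots - E_n^{(k)}|n\rangle\right)g^k + \ldots
\end{aligned}$$

Since the power series is equal to zero, the coefficient of each power of g must
be equal to zero. We get (labelling the equation by the corresponding power of
g)

$$0^{th}\text{-order:} \qquad H_0|n\rangle = \varepsilon_n|n\rangle \tag{10.12}$$

which is just our original assumption, the unperturbed solution.

$$1^{st}\text{-order:} \qquad H_0|N^{(1)}\rangle + U|n\rangle = \varepsilon_n|N^{(1)}\rangle + E_n^{(1)}|n\rangle \tag{10.13}$$

$$k^{th}\text{-order:} \qquad H_0|N^{(k)}\rangle + U|N^{(k-1)}\rangle = \varepsilon_n|N^{(k)}\rangle + E_n^{(1)}|N^{(k-1)}\rangle + \ldots + E_n^{(k)}|N^{(0)}\rangle \tag{10.14}$$

where we have used the notation $|n\rangle = |N^{(0)}\rangle$.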

Let us consider the $1^{st}$-order equation. If we apply the linear functional $\langle n|$
we get

$$\langle n|H_0|N^{(1)}\rangle + \langle n|U|n\rangle = \langle n|\varepsilon_n|N^{(1)}\rangle + \langle n|E_n^{(1)}|n\rangle \tag{10.15}$$

$$\varepsilon_n\langle n|N^{(1)}\rangle + \langle n|U|n\rangle = \varepsilon_n\langle n|N^{(1)}\rangle + E_n^{(1)}\langle n|n\rangle \tag{10.16}$$

or since $\langle n|N^{(1)}\rangle = 0$ we get

$$E_n^{(1)} = \langle n|U|n\rangle$$

that is, the $1^{st}$-order correction to the energy is the diagonal matrix element
of U in the $|n\rangle$ (unperturbed) basis, or equivalently the expectation value of U
in the $n^{th}$ unperturbed state $|n\rangle$.

Therefore, to first order in g we have

$$E_n = \varepsilon_n + gE_n^{(1)} = \varepsilon_n + g\langle n|U|n\rangle = \varepsilon_n + \langle n|V|n\rangle \tag{10.17}$$

where we have reabsorbed the factor g back into the original potential energy 
function. 

In the same manner, if we apply the linear functional $\langle n|$ to the $k^{th}$-order
equation we get

$$E_n^{(k)} = \langle n|U|N^{(k-1)}\rangle \tag{10.18}$$

This says that, if we know the correction to the eigenvector to order (k - 1),
then we can calculate the correction to the energy eigenvalue to order k (the
next order).

Now the $k^{th}$-order correction to the eigenvector, $|N^{(k)}\rangle$, is just another vector
in the space and, hence, we can expand it as a linear combination of the $|n\rangle$
states (since they are a basis).

$$|N^{(k)}\rangle = \sum_{m\neq n}|m\rangle\langle m|N^{(k)}\rangle \tag{10.19}$$

The state $|n\rangle$ is not included because $\langle n|N^{(i)}\rangle = 0$ by our choice of normalization.

In order to evaluate this sum, we must find an expression for the coefficients
$\langle m|N^{(k)}\rangle$.

This can be done by applying the linear functional $\langle m|$, $m \neq n$, to the $k^{th}$-order
equation. We get

$$\langle m|H_0|N^{(k)}\rangle + \langle m|U|N^{(k-1)}\rangle = \langle m|\varepsilon_n|N^{(k)}\rangle + \langle m|E_n^{(1)}|N^{(k-1)}\rangle + \ldots + \langle m|E_n^{(k)}|N^{(0)}\rangle$$

$$\varepsilon_m\langle m|N^{(k)}\rangle + \langle m|U|N^{(k-1)}\rangle = \varepsilon_n\langle m|N^{(k)}\rangle + E_n^{(1)}\langle m|N^{(k-1)}\rangle + \ldots + E_n^{(k)}\langle m|N^{(0)}\rangle$$

If we assume that $\varepsilon_m \neq \varepsilon_n$ (we have nondegenerate levels) we get

$$\langle m|N^{(k)}\rangle = \frac{1}{\varepsilon_n - \varepsilon_m}\left(\langle m|U|N^{(k-1)}\rangle - E_n^{(1)}\langle m|N^{(k-1)}\rangle - \ldots - E_n^{(k)}\langle m|N^{(0)}\rangle\right) \tag{10.20}$$

This formula allows us to find the $k^{th}$-order correction to the eigenvector in
terms of lower order corrections to $|N\rangle$ and $E_n$ as long as $|n\rangle$ corresponds to a
nondegenerate level.

To see how this works, we will calculate the corrections to second order. 

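For first order, we let k = 1 and get

$$\begin{aligned}
\langle m|N^{(1)}\rangle &= \frac{1}{\varepsilon_n - \varepsilon_m}\left(\langle m|U|N^{(0)}\rangle - E_n^{(1)}\langle m|N^{(0)}\rangle\right) \\
&= \frac{1}{\varepsilon_n - \varepsilon_m}\left(\langle m|U|n\rangle - E_n^{(1)}\langle m|n\rangle\right) \\
&= \frac{1}{\varepsilon_n - \varepsilon_m}\langle m|U|n\rangle
\end{aligned} \tag{10.21}$$

which gives

$$|N^{(1)}\rangle = \sum_{m\neq n}|m\rangle\langle m|N^{(1)}\rangle = \sum_{m\neq n}|m\rangle\frac{1}{\varepsilon_n - \varepsilon_m}\langle m|U|n\rangle \tag{10.22}$$

Therefore, to first order in g we have

$$|N\rangle = |n\rangle + g|N^{(1)}\rangle = |n\rangle + \sum_{m\neq n}|m\rangle\frac{1}{\varepsilon_n - \varepsilon_m}\langle m|V|n\rangle \tag{10.23}$$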


We then use to calculate E \the second order correction to the energy, 
using 


E™ =(n|f/|lV (1) ) = (n|f/( Y \ m ) —~-(to| U |n) j 

\m£n — ) 


- E 

mtn 


(n\ U | m) 

£-n ~ &m 


(10.24) 


Therefore, to second order in g we have 


E n =e n+ gE^+g 2 E^ 


= e n + (n\ V | n) + Y 


(n\ V | m) 


m±n £n £m 


(10.25) 
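We then obtain the second order correction to the state vector in the same way. Since $\langle m | N^{(0)}\rangle = \langle m | n\rangle = 0$ for $m \neq n$ and $E_n^{(1)} = \langle n| U |n\rangle$,

$$\begin{aligned}
\langle m | N^{(2)}\rangle &= \frac{1}{\varepsilon_n - \varepsilon_m}\left( \langle m| U |N^{(1)}\rangle - E_n^{(1)} \langle m | N^{(1)}\rangle - E_n^{(2)} \langle m | N^{(0)}\rangle \right) \\
&= \frac{1}{\varepsilon_n - \varepsilon_m} \langle m| U |N^{(1)}\rangle - \frac{1}{\varepsilon_n - \varepsilon_m} \langle n| U |n\rangle \langle m | N^{(1)}\rangle \\
&= \frac{1}{\varepsilon_n - \varepsilon_m} \langle m| U \left( \sum_{k \neq n} |k\rangle \frac{1}{\varepsilon_n - \varepsilon_k} \langle k| U |n\rangle \right) - \frac{1}{\varepsilon_n - \varepsilon_m} \langle n| U |n\rangle \frac{1}{\varepsilon_n - \varepsilon_m} \langle m| U |n\rangle \\
&= \sum_{k \neq n} \frac{\langle m| U |k\rangle \langle k| U |n\rangle}{(\varepsilon_n - \varepsilon_m)(\varepsilon_n - \varepsilon_k)} - \frac{\langle n| U |n\rangle \langle m| U |n\rangle}{(\varepsilon_n - \varepsilon_m)^2}
\end{aligned} \tag{10.26}$$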


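Therefore,

$$|N^{(2)}\rangle = \sum_{m \neq n} \sum_{k \neq n} |m\rangle \frac{\langle m| U |k\rangle \langle k| U |n\rangle}{(\varepsilon_n - \varepsilon_m)(\varepsilon_n - \varepsilon_k)} - \sum_{m \neq n} |m\rangle \frac{\langle n| U |n\rangle \langle m| U |n\rangle}{(\varepsilon_n - \varepsilon_m)^2} \tag{10.27}$$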


and so on. 
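As a rough numerical illustration of (10.21) and (10.25), the sketch below evaluates the second-order energy and the first-order state coefficients for a small test system and compares the energy with the exact eigenvalue. The two-level Hamiltonian, the numbers in it, and the helper name `rspt_second_order` are assumptions made only for this illustration; they do not come from the text.

```python
import math

def rspt_second_order(eps, V, n):
    """Energy of level n through second order, Eq. (10.25), together with
    the first-order coefficients <m|N^(1)> of Eq. (10.21) (with g absorbed in V)."""
    energy = eps[n] + V[n][n]
    coeffs = {}
    for m in range(len(eps)):
        if m == n:
            continue
        denom = eps[n] - eps[m]          # nonzero only for a nondegenerate level
        coeffs[m] = V[m][n] / denom
        energy += abs(V[n][m]) ** 2 / denom
    return energy, coeffs

# Hypothetical test system: H0 = diag(0, 1) and a small real symmetric V
eps = [0.0, 1.0]
V = [[0.02, 0.10],
     [0.10, -0.03]]

E_pert, coeffs = rspt_second_order(eps, V, 0)

# Exact lower eigenvalue of the 2x2 matrix H0 + V (closed form)
a, d, b = eps[0] + V[0][0], eps[1] + V[1][1], V[0][1]
E_exact = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b ** 2)

print(E_pert, E_exact, coeffs)
```

The two energies differ only at third order in the perturbation, which is the behavior expected from a series truncated at second order.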

An Example 

We will now do an example where we know the exact answer so that we can 
compare it to the perturbation results. 

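We consider a 1-dimensional system represented by a perturbed harmonic oscillator where

$$H = H_0 + V \tag{10.28}$$

with

$$H_0 = \hbar\omega\left(a^+ a + \tfrac{1}{2}\right) \rightarrow \text{harmonic oscillator} \tag{10.29}$$

$$H_0 |n\rangle = \varepsilon_n |n\rangle = \hbar\omega\left(n + \tfrac{1}{2}\right) |n\rangle \tag{10.30}$$

In standard operator notation

$$H_0 = \frac{p^2}{2m} + \frac{1}{2} k x^2 , \quad k = m\omega^2 \tag{10.31}$$

We now perturb the system with the potential energy term

$$V = \frac{1}{2} k' x^2 , \quad k' \ll k \tag{10.32}$$

Therefore,

$$H = \frac{p^2}{2m} + \frac{1}{2}(k + k') x^2 \tag{10.33}$$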
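We still have a harmonic oscillator (with a changed spring constant). This says that the new energies are given by

$$E_n = \hbar\tilde{\omega}\left(n + \tfrac{1}{2}\right) \tag{10.34}$$

where

$$\tilde{\omega} = \sqrt{\frac{k + k'}{m}} = \sqrt{\frac{k}{m}} \sqrt{1 + \frac{k'}{k}} = \omega \sqrt{1 + \frac{k'}{k}} \tag{10.35}$$

Therefore, the exact energies for the perturbed system are

$$E_n = \hbar\tilde{\omega}\left(n + \tfrac{1}{2}\right) = \hbar\omega \sqrt{1 + \frac{k'}{k}} \left(n + \tfrac{1}{2}\right) \tag{10.36}$$

For $k' \ll k$, we can expand this as

$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right)\left(1 + \frac{1}{2}\frac{k'}{k} - \frac{1}{8}\left(\frac{k'}{k}\right)^2 + \cdots\right) \tag{10.37}$$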


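which should correspond to the energy calculated to 2nd order in perturbation theory. We now do the perturbation calculation.

Our earlier derivation gives

$$E_n = \varepsilon_n + \langle n| V |n\rangle + \sum_{m \neq n} \frac{|\langle n| V |m\rangle|^2}{\varepsilon_n - \varepsilon_m} \tag{10.38}$$

$$|N\rangle = |n\rangle + \sum_{m \neq n} |m\rangle \frac{\langle m| V |n\rangle}{\varepsilon_n - \varepsilon_m} \tag{10.39}$$

where

$$V = \frac{1}{2} k' x^2 = \frac{1}{4} \frac{k'\hbar}{m\omega} (a + a^+)^2 \tag{10.40}$$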


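We need to calculate this matrix element

$$\begin{aligned}
\langle m| V |n\rangle &= \frac{1}{4}\frac{k'\hbar}{m\omega} \langle m| (a + a^+)^2 |n\rangle = \frac{1}{4}\frac{k'\hbar}{m\omega} \langle m| \left( a^2 + a a^+ + a^+ a + (a^+)^2 \right) |n\rangle \\
&= \frac{1}{4}\frac{k'\hbar}{m\omega} \langle m| \left[ \sqrt{n(n-1)}\,|n-2\rangle + (n+1)\,|n\rangle + n\,|n\rangle + \sqrt{(n+1)(n+2)}\,|n+2\rangle \right] \\
&= \frac{1}{4}\frac{k'\hbar}{m\omega} \left( \sqrt{n(n-1)}\,\delta_{m,n-2} + (2n+1)\,\delta_{m,n} + \sqrt{(n+1)(n+2)}\,\delta_{m,n+2} \right)
\end{aligned} \tag{10.41}$$

where we have used

$$\langle m| a |n\rangle = \sqrt{n}\, \langle m | n-1\rangle = \sqrt{n}\, \delta_{m,n-1} \tag{10.42}$$

$$\langle m| a^+ |n\rangle = \sqrt{n+1}\, \langle m | n+1\rangle = \sqrt{n+1}\, \delta_{m,n+1} \tag{10.43}$$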
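Therefore,

$$\langle n| V |n\rangle = \frac{1}{4}\frac{k'\hbar}{m\omega}(2n+1) = \hbar\omega\left(n + \tfrac{1}{2}\right)\left(\frac{1}{2}\frac{k'}{k}\right) \tag{10.44}$$

and

$$\sum_{m \neq n} \frac{|\langle n| V |m\rangle|^2}{\varepsilon_n - \varepsilon_m} = \left(\frac{1}{4}\frac{k'\hbar}{m\omega}\right)^2 \left[ \frac{n(n-1)}{2\hbar\omega} + \frac{(n+1)(n+2)}{(-2\hbar\omega)} \right] = -\hbar\omega\left(n + \tfrac{1}{2}\right)\left[\frac{1}{8}\left(\frac{k'}{k}\right)^2\right] \tag{10.45}$$

which then gives

$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right)\left(1 + \frac{1}{2}\frac{k'}{k} - \frac{1}{8}\left(\frac{k'}{k}\right)^2 + \cdots\right) \tag{10.46}$$

in agreement with the exact result (to 2nd order).

To calculate the new state vector to first order we need

$$\sum_{m \neq n} |m\rangle \frac{\langle m| V |n\rangle}{\varepsilon_n - \varepsilon_m} = \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{n(n-1)}}{2\hbar\omega}\,|n-2\rangle + \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{(n+1)(n+2)}}{(-2\hbar\omega)}\,|n+2\rangle \tag{10.47}$$

which gives

$$|N\rangle = |n\rangle + \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{n(n-1)}}{2\hbar\omega}\,|n-2\rangle - \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{(n+1)(n+2)}}{2\hbar\omega}\,|n+2\rangle \tag{10.48}$$

What does the new ground state wave function look like? We have

$$|N=0\rangle = |0\rangle - \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{2}}{2\hbar\omega}\,|2\rangle \tag{10.49}$$

and

$$\langle x | N=0\rangle = \langle x | 0\rangle - \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{2}}{2\hbar\omega}\,\langle x | 2\rangle \tag{10.50}$$

$$\psi_{N=0}(x) = \psi_0(x) - \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{2}}{2\hbar\omega}\,\psi_2(x) \tag{10.51}$$

Now we found earlier that

$$\langle x | 0\rangle = \psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-\frac{m\omega x^2}{2\hbar}} \tag{10.52}$$

and

$$\langle x | 2\rangle = \psi_2(x) = \left(\frac{m\omega}{4\pi\hbar}\right)^{1/4} \left(2\frac{m\omega}{\hbar}x^2 - 1\right) e^{-\frac{m\omega x^2}{2\hbar}} \tag{10.53}$$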
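which gives

$$\begin{aligned}
\psi_{N=0}(x) &= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-\frac{m\omega x^2}{2\hbar}} \left[ 1 - \frac{1}{4}\frac{k'\hbar}{m\omega}\frac{\sqrt{2}}{2\hbar\omega}\frac{1}{\sqrt{2}}\left(2\frac{m\omega}{\hbar}x^2 - 1\right) \right] \\
&= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-\frac{m\omega x^2}{2\hbar}} \left( 1 + \frac{k'}{8k} - \frac{m\omega k'}{4\hbar k}x^2 \right)
\end{aligned} \tag{10.54}$$

Since we are only changing the spring constant we should have

$$\begin{aligned}
\psi_{N=0}(x) = \tilde{\psi}_0(x) &= \left(\frac{m\tilde{\omega}}{\pi\hbar}\right)^{1/4} e^{-\frac{m\tilde{\omega} x^2}{2\hbar}} = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \left(1 + \frac{k'}{k}\right)^{1/8} e^{-\frac{m\omega x^2}{2\hbar}\left(1 + \frac{k'}{k}\right)^{1/2}} \\
&\approx \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \left(1 + \frac{k'}{8k}\right) e^{-\frac{m\omega x^2}{2\hbar}\left(1 + \frac{k'}{2k}\right)} \\
&\approx \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \left(1 + \frac{k'}{8k}\right)\left(1 - \frac{m\omega k'}{4\hbar k}x^2\right) e^{-\frac{m\omega x^2}{2\hbar}} \\
&\approx \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \left(1 + \frac{k'}{8k} - \frac{m\omega k'}{4\hbar k}x^2\right) e^{-\frac{m\omega x^2}{2\hbar}}
\end{aligned} \tag{10.55}$$

which agrees with the perturbation result to this order.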
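The comparison made in this example can also be checked numerically. The following sketch evaluates the matrix elements (10.41), the second-order energy (10.38), the exact energy (10.36), and the two ground state wave functions (10.54) and (10.55). The choice of units ($\hbar = m = \omega = 1$), the value of $k'$, and all function names are assumptions made for the illustration only.

```python
import math

HBAR = M = OMEGA = 1.0          # assumed units: hbar = m = omega = 1
K = M * OMEGA ** 2              # unperturbed spring constant k = m*omega^2

def v_elem(m, n, kp):
    """<m|V|n> for V = k' x^2 / 2, Eq. (10.41)."""
    pref = 0.25 * kp * HBAR / (M * OMEGA)
    if m == n - 2:
        return pref * math.sqrt(n * (n - 1))
    if m == n:
        return pref * (2 * n + 1)
    if m == n + 2:
        return pref * math.sqrt((n + 1) * (n + 2))
    return 0.0

def eps(j):
    """Unperturbed oscillator energy, Eq. (10.30)."""
    return HBAR * OMEGA * (j + 0.5)

def energy_second_order(n, kp):
    """Eq. (10.38); only m = n - 2 and m = n + 2 contribute to the sum."""
    total = eps(n) + v_elem(n, n, kp)
    for m in (n - 2, n + 2):
        if m >= 0:
            total += v_elem(n, m, kp) ** 2 / (eps(n) - eps(m))
    return total

def energy_exact(n, kp):
    """Eq. (10.36)."""
    return HBAR * OMEGA * math.sqrt(1 + kp / K) * (n + 0.5)

def psi0_perturbed(x, kp):
    """First-order ground state wave function, Eq. (10.54)."""
    psi0 = (M * OMEGA / (math.pi * HBAR)) ** 0.25 * math.exp(-M * OMEGA * x * x / (2 * HBAR))
    return psi0 * (1 + kp / (8 * K) - M * OMEGA * kp * x * x / (4 * HBAR * K))

def psi0_exact(x, kp):
    """Ground state of the oscillator with frequency omega*sqrt(1 + k'/k)."""
    w = OMEGA * math.sqrt(1 + kp / K)
    return (M * w / (math.pi * HBAR)) ** 0.25 * math.exp(-M * w * x * x / (2 * HBAR))

kp = 0.05                       # hypothetical perturbation strength, k'/k = 0.05
for n in range(4):
    print(n, energy_second_order(n, kp), energy_exact(n, kp))
print(psi0_perturbed(0.5, kp), psi0_exact(0.5, kp))
```

With $k'/k = 0.05$ the energies should agree up to terms of order $(k'/k)^3$ and the wave functions up to terms of order $(k'/k)^2$.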


The perturbation theory we have developed so far breaks down if there are any states where

$$\varepsilon_n = \varepsilon_m \quad \text{but} \quad \langle m| V |n\rangle \neq 0 \tag{10.56}$$

i.e., degenerate states with nonzero matrix elements of the perturbing potential between them.


10.2. Degenerate Case 

We handle this case as follows. Suppose we have a group of $k$ states

$$|n_1\rangle , |n_2\rangle , |n_3\rangle , \ldots , |n_k\rangle \tag{10.57}$$

that are degenerate states of the unperturbed Hamiltonian $H_0$, i.e.,

$$H_0 |n_i\rangle = \varepsilon_{n_1} |n_i\rangle , \quad i = 1,2,3,4,5,\ldots,k \tag{10.58}$$

If $\langle n_i| V |n_j\rangle \neq 0$ for $i \neq j$ within this set, the previous perturbation formulas will fail because the energy denominators $\varepsilon_{n_i} - \varepsilon_{n_j} \to 0$.

Remember, however, that any linear combination of the degenerate states

$$|n_1\rangle , |n_2\rangle , |n_3\rangle , \ldots , |n_k\rangle \tag{10.59}$$

is also an eigenstate of $H_0$ with the same energy $\varepsilon_{n_1}$.
Therefore, if we can choose a different set of basis states (start off with a new set of zero-order states) within this degenerate subspace, i.e., choose a new set of $k$ orthogonal states (linear combinations of the old set of degenerate states)

$$|n_\alpha\rangle = \sum_{i=1}^{k} C_{\alpha i} |n_i\rangle \tag{10.60}$$

such that we have

$$\langle n_\alpha| V |n_\beta\rangle = 0 \quad \text{for} \quad \alpha \neq \beta \tag{10.61}$$

then we can use the perturbation formulas as derived earlier.

This procedure will work because the terms with zero denominators will have zero numerators and if one looks at the derivation, this means that these terms do not even appear in the final results, i.e., the zero numerators take effect before the zero denominators appear.

This condition says that the correct choice of zero-order states within the degenerate subspace (the set of degenerate vectors) for doing degenerate perturbation theory is that set which diagonalizes the matrix representation of $V$ within each group of degenerate states.


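The problem of diagonalizing $V$ within a group of $k$ states

$$|n_1\rangle , |n_2\rangle , |n_3\rangle , \ldots , |n_k\rangle \tag{10.62}$$

is that of finding the eigenvectors and eigenvalues of the $k \times k$ matrix

$$\begin{pmatrix}
\langle n_1| V |n_1\rangle & \langle n_1| V |n_2\rangle & \cdots & \langle n_1| V |n_k\rangle \\
\langle n_2| V |n_1\rangle & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
\langle n_k| V |n_1\rangle & \cdots & \cdots & \langle n_k| V |n_k\rangle
\end{pmatrix} \tag{10.63}$$

We now show that if the coefficients $C_{\alpha i}$ of the new zero-order states are just the components of the eigenvectors of this matrix, then it will be diagonalized in the degenerate subspace.

Suppose that we represent the eigenvector by the column vector

$$|n_\alpha\rangle = \begin{pmatrix} C_{\alpha 1} \\ C_{\alpha 2} \\ \vdots \\ C_{\alpha k} \end{pmatrix} \tag{10.64}$$

then the statement that $|n_\alpha\rangle$ is an eigenvector of the $k \times k$ submatrix of $V$ with eigenvalues that we write as $E_{n_\alpha}^{(1)}$ is equivalent to writing

$$V |n_\alpha\rangle = E_{n_\alpha}^{(1)} |n_\alpha\rangle \tag{10.65}$$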
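or

$$\begin{pmatrix}
\langle n_1| V |n_1\rangle & \langle n_1| V |n_2\rangle & \cdots & \langle n_1| V |n_k\rangle \\
\langle n_2| V |n_1\rangle & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
\langle n_k| V |n_1\rangle & \cdots & \cdots & \langle n_k| V |n_k\rangle
\end{pmatrix}
\begin{pmatrix} C_{\alpha 1} \\ C_{\alpha 2} \\ \vdots \\ C_{\alpha k} \end{pmatrix}
= E_{n_\alpha}^{(1)}
\begin{pmatrix} C_{\alpha 1} \\ C_{\alpha 2} \\ \vdots \\ C_{\alpha k} \end{pmatrix} \tag{10.66}$$

or finally

$$\sum_i \langle n_j| V |n_i\rangle C_{\alpha i} = E_{n_\alpha}^{(1)} C_{\alpha j} \tag{10.67}$$

All of these calculations take place within the $k$-dimensional degenerate subspace.

We will assume that the eigenvectors are normalized, which implies that

$$\sum_i |C_{\alpha i}|^2 = 1 \tag{10.68}$$

i.e., vectors are normalized to one.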


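Now consider another of the new vectors given by

$$|n_\beta\rangle = \sum_{j=1}^{k} C_{\beta j} |n_j\rangle \tag{10.69}$$

We then have

$$\langle n_\beta| = \sum_{j=1}^{k} C_{\beta j}^{*} \langle n_j| \tag{10.70}$$

Applying the linear functional $\langle n_\beta|$ to the eigenvector/eigenvalue equation we get

$$\sum_j \sum_i C_{\beta j}^{*} \langle n_j| V |n_i\rangle C_{\alpha i} = E_{n_\alpha}^{(1)} \sum_j C_{\beta j}^{*} C_{\alpha j} \tag{10.71}$$

Now, since the eigenvectors of any Hermitian matrix are always a complete orthonormal set (or can always be made so using the Gram-Schmidt process), the orthonormality of the new vectors says that

$$\sum_j C_{\beta j}^{*} C_{\alpha j} = \langle n_\beta | n_\alpha\rangle = \delta_{\beta\alpha} \tag{10.72}$$

Therefore the vectors

$$|n_\alpha\rangle = \sum_{i=1}^{k} C_{\alpha i} |n_i\rangle \tag{10.73}$$

satisfy

$$\left( \sum_j \langle n_j| C_{\beta j}^{*} \right) V \left( \sum_i C_{\alpha i} |n_i\rangle \right) = E_{n_\alpha}^{(1)} \delta_{\alpha\beta} \tag{10.74}$$

$$\langle n_\beta| V |n_\alpha\rangle = E_{n_\alpha}^{(1)} \delta_{\alpha\beta} \tag{10.75}$$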





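This says that the corresponding eigenvalue $E_{n_\alpha}^{(1)}$ of one of the new vectors is the first order energy correction for the state $|N_\alpha\rangle$. The states

$$|n_\alpha\rangle , |n_\beta\rangle , |n_\gamma\rangle , \ldots , |n_\kappa\rangle \tag{10.76}$$

are called the new zeroth-order state vectors.

Thus the group of states

$$|n_1\rangle , |n_2\rangle , |n_3\rangle , \ldots , |n_k\rangle \tag{10.77}$$

in the presence of the perturbation $V$ split (rearrange) into the $k$ states

$$|n_\alpha\rangle , |n_\beta\rangle , |n_\gamma\rangle , \ldots , |n_\kappa\rangle \tag{10.78}$$

which are given to first order by

$$|N_\alpha\rangle = |n_\alpha\rangle + \sum_{m \neq \alpha,\beta,\ldots,\kappa} \frac{|m\rangle \langle m| V |n_\alpha\rangle}{\varepsilon_{n_1} - \varepsilon_m} \tag{10.79}$$

and the energy to second order is

$$E_{n_\alpha} = \varepsilon_{n_1} + \langle n_\alpha| V |n_\alpha\rangle + \sum_{m \neq \alpha,\beta,\ldots,\kappa} \frac{|\langle m| V |n_\alpha\rangle|^2}{\varepsilon_{n_1} - \varepsilon_m} \tag{10.80}$$

where

$$\langle n_\alpha| V |n_\alpha\rangle = E_{n_\alpha}^{(1)} \tag{10.81}$$

is an eigenvalue of the $V$ matrix in the degenerate subspace.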


An Example 

We now consider a 2-dimensional oscillator that is perturbed by a potential of the form

$$V = \lambda x y \tag{10.82}$$

We then have

$$H = H_0 + V \tag{10.83}$$

where

$$H_0 = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{1}{2} k (x^2 + y^2) \tag{10.84}$$

As we showed earlier, using the $a_x$ and $a_y$ operators we get

$$H_0 = \hbar\omega\left( a_x^+ a_x + a_y^+ a_y + 1 \right) \tag{10.85}$$

$$H_0 |n_x, n_y\rangle = \varepsilon_{n_x,n_y} |n_x, n_y\rangle = \hbar\omega (n_x + n_y + 1) |n_x, n_y\rangle \tag{10.86}$$

$$\text{degeneracy} = n_x + n_y + 1 \tag{10.87}$$





and

$$V = \lambda \frac{\hbar}{2m\omega} (a_x + a_x^+)(a_y + a_y^+) \tag{10.88}$$

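The unperturbed ground-state is $|0,0\rangle$ with $\varepsilon_{0,0} = \hbar\omega$. It is a nondegenerate level, so we can apply the standard perturbation theory to get

$$E_0 = \varepsilon_{0,0} + \langle 0,0| V |0,0\rangle + \sum_{(m,n) \neq (0,0)} \frac{|\langle m,n| V |0,0\rangle|^2}{\varepsilon_{0,0} - \varepsilon_{m,n}} \tag{10.89}$$

Now

$$\langle 0,0| V |0,0\rangle = \frac{\lambda\hbar}{2m\omega} \langle 0,0| (a_x + a_x^+)(a_y + a_y^+) |0,0\rangle = 0 \tag{10.90}$$

and

$$\begin{aligned}
\langle m,n| V |0,0\rangle &= \frac{\lambda\hbar}{2m\omega} \langle m,n| (a_x + a_x^+)(a_y + a_y^+) |0,0\rangle \\
&= \frac{\lambda\hbar}{2m\omega} \langle m,n| a_x^+ a_y^+ |0,0\rangle = \frac{\lambda\hbar}{2m\omega} \langle m,n | 1,1\rangle = \frac{\lambda\hbar}{2m\omega} \delta_{m,1}\delta_{n,1}
\end{aligned} \tag{10.91}$$

Thus, the correction to first order is zero. Calculating to second order, with $\varepsilon_{0,0} - \varepsilon_{1,1} = -2\hbar\omega$, we get

$$E_0 = \hbar\omega - \frac{1}{2\hbar\omega}\left(\frac{\lambda\hbar}{2m\omega}\right)^2 \tag{10.92}$$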

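The next unperturbed level is 2-fold degenerate, i.e.,

$$n_x = 0 , n_y = 1 \rightarrow \varepsilon_{0,1} = 2\hbar\omega$$
$$n_x = 1 , n_y = 0 \rightarrow \varepsilon_{1,0} = 2\hbar\omega$$

For ease of notation we will sometimes denote

$$|1,0\rangle = |a\rangle \quad \text{and} \quad |0,1\rangle = |b\rangle \tag{10.93}$$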

We now use degenerate perturbation theory. 


The procedure is to evaluate the V matrix in the 2x2 degenerate subspace, 
diagonalize it and obtain the first order corrections. 


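The 2x2 matrix is

$$V = \begin{pmatrix} V_{aa} & V_{ab} \\ V_{ba} & V_{bb} \end{pmatrix} = \begin{pmatrix} \langle a| V |a\rangle & \langle a| V |b\rangle \\ \langle b| V |a\rangle & \langle b| V |b\rangle \end{pmatrix} \tag{10.94}$$

Now

$$V_{aa} = \langle 1,0| V |1,0\rangle = 0 = \langle 0,1| V |0,1\rangle = V_{bb} \tag{10.95}$$

$$V_{ab} = V_{ba} = \langle 1,0| V |0,1\rangle = \frac{\lambda\hbar}{2m\omega} \langle 1,0| a_x^+ a_y |0,1\rangle = \frac{\lambda\hbar}{2m\omega} \tag{10.96}$$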
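Therefore the 2x2 submatrix is

$$V = \frac{\lambda\hbar}{2m\omega} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{10.97}$$

This is simple to diagonalize. We get these results

$$|a'\rangle = \frac{1}{\sqrt{2}} \left( |a\rangle + |b\rangle \right) \rightarrow \text{eigenvalue} = +\frac{\lambda\hbar}{2m\omega} \tag{10.98}$$

$$|b'\rangle = \frac{1}{\sqrt{2}} \left( |a\rangle - |b\rangle \right) \rightarrow \text{eigenvalue} = -\frac{\lambda\hbar}{2m\omega} \tag{10.99}$$

$|a'\rangle$ and $|b'\rangle$ are the new zeroth order state vectors (eigenvectors of the 2x2 submatrix) and

$$\pm\frac{\lambda\hbar}{2m\omega} \tag{10.100}$$

are the corresponding first order energy corrections.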


Thus, the 2-fold degenerate level splits into two nondegenerate levels as shown 
in Figure 10.1 below. 


[Figure 10.1: Splitting of a degenerate Level. The level at $2\hbar\omega$ splits into two levels at $2\hbar\omega \pm \lambda\hbar/2m\omega$.]


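where

$$E_{a'} = 2\hbar\omega + \frac{\lambda\hbar}{2m\omega} \tag{10.101}$$

$$E_{b'} = 2\hbar\omega - \frac{\lambda\hbar}{2m\omega} \tag{10.102}$$

$$\Delta E = \text{level splitting} = \frac{\lambda\hbar}{m\omega} \tag{10.103}$$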
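The two steps of this example (ordinary second-order theory for the ground state, and diagonalization of $V$ inside the degenerate subspace for the first excited level) can be reproduced with a short script. This is only a sketch: the units ($\hbar = m = \omega = 1$), the coupling value, the basis cutoff, and the function names are assumptions chosen for the illustration.

```python
import math

HBAR = M = OMEGA = 1.0   # assumed units: hbar = m = omega = 1
LAM = 0.1                # hypothetical coupling strength lambda in V = lambda*x*y

def ladder_sum(m, n):
    """<m|(a + a^+)|n> for a single oscillator."""
    if m == n - 1:
        return math.sqrt(n)
    if m == n + 1:
        return math.sqrt(n + 1)
    return 0.0

def v_elem(bra, ket):
    """<mx,my|V|nx,ny> built from Eq. (10.88)."""
    c = LAM * HBAR / (2 * M * OMEGA)
    return c * ladder_sum(bra[0], ket[0]) * ladder_sum(bra[1], ket[1])

def eps(state):
    """Unperturbed energy, Eq. (10.86)."""
    return HBAR * OMEGA * (state[0] + state[1] + 1)

# Nondegenerate ground state: second-order sum (10.89) over a truncated basis
ground = (0, 0)
others = [(i, j) for i in range(4) for j in range(4) if (i, j) != ground]
E0 = eps(ground) + v_elem(ground, ground) + sum(
    v_elem(s, ground) ** 2 / (eps(ground) - eps(s)) for s in others)

# Degenerate first excited level: eigenvalues of V in the {|1,0>, |0,1>} subspace
a, b = (1, 0), (0, 1)
Vaa, Vab, Vbb = v_elem(a, a), v_elem(a, b), v_elem(b, b)
mean = 0.5 * (Vaa + Vbb)
half_gap = math.sqrt(0.25 * (Vaa - Vbb) ** 2 + Vab ** 2)
first_order_shifts = (mean + half_gap, mean - half_gap)

print(E0, first_order_shifts)
```

Only the state $|1,1\rangle$ contributes to the ground state sum, so enlarging the basis cutoff does not change `E0`; the two first-order shifts are $\pm\lambda\hbar/2m\omega$ as in (10.100).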

Another Example 

Now let us consider a system of two spin-1/2 particles in a magnetic field. We also assume that there exists a direct spin-spin interaction so that the Hamiltonian takes the form

$$H = \left( \alpha \vec{S}_{1,op} + \beta \vec{S}_{2,op} \right) \cdot \vec{B} + \gamma \vec{S}_{1,op} \cdot \vec{S}_{2,op} \tag{10.104}$$
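If we choose $\vec{B} = B\hat{z}$ and $\gamma \gg \alpha, \beta$, then we can write

$$H = H_0 + V \tag{10.105}$$

$$H_0 = \gamma \vec{S}_{1,op} \cdot \vec{S}_{2,op} \tag{10.106}$$

$$V = \alpha B S_{1z} + \beta B S_{2z} \tag{10.107}$$

We define

$$\vec{S}_{op} = \vec{S}_{1,op} + \vec{S}_{2,op} = \text{total spin angular momentum} \tag{10.108}$$

and then we have

$$\begin{aligned}
\vec{S}_{op}^{\,2} = \vec{S}_{op} \cdot \vec{S}_{op} &= \left( \vec{S}_{1,op} + \vec{S}_{2,op} \right) \cdot \left( \vec{S}_{1,op} + \vec{S}_{2,op} \right) \\
&= \vec{S}_{1,op}^{\,2} + \vec{S}_{2,op}^{\,2} + 2 \vec{S}_{1,op} \cdot \vec{S}_{2,op} \\
&= \tfrac{3}{4}\hbar^2 I + \tfrac{3}{4}\hbar^2 I + 2 \vec{S}_{1,op} \cdot \vec{S}_{2,op}
\end{aligned} \tag{10.109}$$

or

$$H_0 = \gamma \vec{S}_{1,op} \cdot \vec{S}_{2,op} = \frac{\gamma}{2} \left( \vec{S}_{op}^{\,2} - \frac{3}{2}\hbar^2 I \right) \tag{10.110}$$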


Our earlier discussion of the addition of angular momentum says that when we add two spin-1/2 angular momenta we get the resultant total angular momentum values 0 and 1, i.e.,

$$\tfrac{1}{2} \otimes \tfrac{1}{2} = 0 \oplus 1 \tag{10.111}$$

Each separate spin-1/2 system has the eigenvectors/eigenvalues

$$\vec{S}_{i,op}^{\,2} |\pm\rangle = \tfrac{3}{4}\hbar^2 |\pm\rangle , \quad S_{iz} |\pm\rangle = \pm\frac{\hbar}{2} |\pm\rangle \tag{10.112}$$

The corresponding direct-product states are

$$|++\rangle , |+-\rangle , |-+\rangle , |--\rangle \tag{10.113}$$

where the symbols mean

$$|+-\rangle = |+\rangle_1 |-\rangle_2 \tag{10.114}$$

and so on.


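The total angular momentum states are (we derived them earlier) labeled as $|s, m\rangle$ where

$$\vec{S}_{op}^{\,2} |s, m\rangle = \hbar^2 s(s+1) |s, m\rangle \tag{10.115}$$

$$S_z |s, m\rangle = (S_{1z} + S_{2z}) |s, m\rangle = m\hbar |s, m\rangle \tag{10.116}$$

They are given in terms of the direct product states by

$$|1,1\rangle = |++\rangle = |1\rangle \tag{10.117}$$

$$|1,0\rangle = \frac{1}{\sqrt{2}} |+-\rangle + \frac{1}{\sqrt{2}} |-+\rangle = |2\rangle \tag{10.118}$$

$$|1,-1\rangle = |--\rangle = |3\rangle \tag{10.119}$$

$$|0,0\rangle = \frac{1}{\sqrt{2}} |+-\rangle - \frac{1}{\sqrt{2}} |-+\rangle = |4\rangle \tag{10.120}$$

The total angular momentum states are eigenstates of $H_0$ and we use them as the unperturbed or zero-order states.

$$H_0 |1,1\rangle = \frac{\gamma\hbar^2}{4} |1,1\rangle = \varepsilon_1 |1,1\rangle \tag{10.121}$$

$$H_0 |1,0\rangle = \frac{\gamma\hbar^2}{4} |1,0\rangle = \varepsilon_2 |1,0\rangle \tag{10.122}$$

$$H_0 |1,-1\rangle = \frac{\gamma\hbar^2}{4} |1,-1\rangle = \varepsilon_3 |1,-1\rangle \tag{10.123}$$

$$H_0 |0,0\rangle = -\frac{3\gamma\hbar^2}{4} |0,0\rangle = \varepsilon_4 |0,0\rangle \tag{10.124}$$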
We thus have one nondegenerate level and one 3-fold degenerate level. Now using

$$V = \alpha B S_{1z} + \beta B S_{2z} \tag{10.125}$$

we do perturbation theory on these levels.


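Nondegenerate Level

First order:

$$E_4^{(1)} = \langle 4| V |4\rangle = \left( \frac{1}{\sqrt{2}} \langle +-| - \frac{1}{\sqrt{2}} \langle -+| \right) \left( \alpha B S_{1z} + \beta B S_{2z} \right) \left( \frac{1}{\sqrt{2}} |+-\rangle - \frac{1}{\sqrt{2}} |-+\rangle \right) = 0$$

Second order:

$$E_4^{(2)} = \sum_{m \neq 4} \frac{|\langle m| V |4\rangle|^2}{\varepsilon_4 - \varepsilon_m} = \frac{|\langle 1| V |4\rangle|^2}{\varepsilon_4 - \varepsilon_1} + \frac{|\langle 2| V |4\rangle|^2}{\varepsilon_4 - \varepsilon_2} + \frac{|\langle 3| V |4\rangle|^2}{\varepsilon_4 - \varepsilon_3}$$

Using $S_{iz}|\pm\rangle = \pm\frac{\hbar}{2}|\pm\rangle$ we have $V|4\rangle = \frac{\hbar B}{2}(\alpha - \beta)|2\rangle$, so only the $m = 2$ term survives. With $\varepsilon_4 - \varepsilon_2 = -\gamma\hbar^2$ this gives

$$E_4^{(2)} = \frac{\left( \frac{\hbar B}{2}(\alpha - \beta) \right)^2}{\varepsilon_4 - \varepsilon_2} = -\frac{B^2 (\alpha - \beta)^2}{4\gamma}$$

Therefore the energy to second order for the non-degenerate level is

$$E_4 = \varepsilon_4 + E_4^{(1)} + E_4^{(2)} = -\frac{3\gamma\hbar^2}{4} - \frac{B^2 (\alpha - \beta)^2}{4\gamma} \tag{10.126}$$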


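Degenerate Level

In this case, the 3x3 degenerate submatrix of $V$ is

$$\begin{pmatrix}
\langle 1| V |1\rangle & \langle 1| V |2\rangle & \langle 1| V |3\rangle \\
\langle 2| V |1\rangle & \langle 2| V |2\rangle & \langle 2| V |3\rangle \\
\langle 3| V |1\rangle & \langle 3| V |2\rangle & \langle 3| V |3\rangle
\end{pmatrix}
= \frac{(\alpha + \beta)\hbar B}{2}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{10.127}$$

which is already diagonal. Since the diagonal elements are the first order energy corrections, we have (to first order)

$$E_1 = \frac{\gamma\hbar^2}{4} + \frac{(\alpha + \beta)\hbar B}{2} \tag{10.128}$$

$$E_2 = \frac{\gamma\hbar^2}{4} \tag{10.129}$$

$$E_3 = \frac{\gamma\hbar^2}{4} - \frac{(\alpha + \beta)\hbar B}{2} \tag{10.130}$$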


Exact Solution 

We can, in fact, solve this problem exactly and compare it to the perturbation result. We do this by choosing a new basis set (arbitrary choice made to simplify calculations) and rewriting $H$ in terms of operators appropriate to the basis choice (that is what is meant by simplify calculations).

We then use the new basis to construct the 4x4 $H$ matrix and then diagonalize the matrix. This method always works for a system with a small number of states.

Choose the direct product states as a basis

$$|++\rangle = |1\rangle , \quad |+-\rangle = |2\rangle , \quad |-+\rangle = |3\rangle , \quad |--\rangle = |4\rangle \tag{10.131}$$

Write $H$ as (choose operators appropriate (easy to calculate) to the basis or the HOME space)

$$\begin{aligned}
H &= \alpha B S_{1z} + \beta B S_{2z} + \gamma \left( S_{1z} S_{2z} + S_{1x} S_{2x} + S_{1y} S_{2y} \right) \\
&= \alpha B S_{1z} + \beta B S_{2z} + \gamma \left( S_{1z} S_{2z} + \tfrac{1}{2} \left( S_{1+} S_{2-} + S_{1-} S_{2+} \right) \right)
\end{aligned} \tag{10.132}$$
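Construct the 4x4 $H$ matrix

$$\begin{pmatrix}
\langle 1| H |1\rangle & \langle 1| H |2\rangle & \langle 1| H |3\rangle & \langle 1| H |4\rangle \\
\langle 2| H |1\rangle & \langle 2| H |2\rangle & \langle 2| H |3\rangle & \langle 2| H |4\rangle \\
\langle 3| H |1\rangle & \langle 3| H |2\rangle & \langle 3| H |3\rangle & \langle 3| H |4\rangle \\
\langle 4| H |1\rangle & \langle 4| H |2\rangle & \langle 4| H |3\rangle & \langle 4| H |4\rangle
\end{pmatrix} \tag{10.133}$$

using

$$S_z |\pm\rangle = \pm\frac{\hbar}{2} |\pm\rangle \tag{10.134}$$

$$S_+ |+\rangle = 0 = S_- |-\rangle \tag{10.135}$$

$$S_+ |-\rangle = \hbar |+\rangle \quad \text{and} \quad S_- |+\rangle = \hbar |-\rangle \tag{10.136}$$

We get

$$H = \begin{pmatrix}
\frac{B\hbar}{2}(\alpha + \beta) + \frac{\gamma\hbar^2}{4} & 0 & 0 & 0 \\
0 & \frac{B\hbar}{2}(\alpha - \beta) - \frac{\gamma\hbar^2}{4} & \frac{\gamma\hbar^2}{2} & 0 \\
0 & \frac{\gamma\hbar^2}{2} & -\frac{B\hbar}{2}(\alpha - \beta) - \frac{\gamma\hbar^2}{4} & 0 \\
0 & 0 & 0 & -\frac{B\hbar}{2}(\alpha + \beta) + \frac{\gamma\hbar^2}{4}
\end{pmatrix} \tag{10.137}$$

Diagonalizing to get the eigenvalues we find the exact energies

$$E_1 = \frac{\gamma\hbar^2}{4} + \frac{B\hbar}{2}(\alpha + \beta) , \quad E_2 = -\frac{\gamma\hbar^2}{4} + \frac{1}{2}\sqrt{\gamma^2\hbar^4 + B^2\hbar^2(\alpha - \beta)^2}$$

$$E_3 = -\frac{\gamma\hbar^2}{4} - \frac{1}{2}\sqrt{\gamma^2\hbar^4 + B^2\hbar^2(\alpha - \beta)^2} , \quad E_4 = \frac{\gamma\hbar^2}{4} - \frac{B\hbar}{2}(\alpha + \beta)$$

To compare to the perturbation calculation we let $B \to 0$ (small $B$) and we get the approximations

$$E_3 \approx -\frac{3\gamma\hbar^2}{4} - \frac{B^2(\alpha - \beta)^2}{4\gamma} , \quad E_2 \approx \frac{\gamma\hbar^2}{4} + \frac{B^2(\alpha - \beta)^2}{4\gamma}$$

Here the exact $E_3$ is the level that was labeled $E_4$ in the perturbation calculation (it grows out of the state $|0,0\rangle$), and the exact $E_1$, $E_2$, $E_4$ grow out of the 3-fold degenerate level, which was only calculated to first order. With this identification the exact result agrees with the perturbation results.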
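A quick numerical cross-check of this example is sketched below. It uses the block structure of the 4x4 matrix (the states $|++\rangle$ and $|--\rangle$ decouple, leaving a 2x2 block for $|+-\rangle$ and $|-+\rangle$) and compares the exact eigenvalues with the perturbation expressions. The numerical values of $\gamma$, $\alpha$, $\beta$, $B$ and the dictionary labels are hypothetical, and $\hbar = 1$ is an assumed choice of units.

```python
import math

HBAR = 1.0                                    # assumed units: hbar = 1
GAMMA, ALPHA, BETA, B = 1.0, 0.3, 0.1, 0.2    # hypothetical values, spin-spin term dominant

# Nonzero entries of the 4x4 matrix in the basis |++>, |+->, |-+>, |-->
h11 = B * HBAR * (ALPHA + BETA) / 2 + GAMMA * HBAR ** 2 / 4
h22 = B * HBAR * (ALPHA - BETA) / 2 - GAMMA * HBAR ** 2 / 4
h33 = -B * HBAR * (ALPHA - BETA) / 2 - GAMMA * HBAR ** 2 / 4
h23 = GAMMA * HBAR ** 2 / 2
h44 = -B * HBAR * (ALPHA + BETA) / 2 + GAMMA * HBAR ** 2 / 4

# |++> and |--> are already eigenvectors; the middle 2x2 block has closed-form eigenvalues
mean = 0.5 * (h22 + h33)
root = math.sqrt(0.25 * (h22 - h33) ** 2 + h23 ** 2)
exact = {"|++>": h11, "upper mixed": mean + root, "lower mixed": mean - root, "|-->": h44}

# Perturbation theory: first order for the triplet level, second order for the singlet
pert = {
    "|++>": GAMMA * HBAR ** 2 / 4 + (ALPHA + BETA) * HBAR * B / 2,
    "upper mixed": GAMMA * HBAR ** 2 / 4,
    "lower mixed": -3 * GAMMA * HBAR ** 2 / 4 - B ** 2 * (ALPHA - BETA) ** 2 / (4 * GAMMA),
    "|-->": GAMMA * HBAR ** 2 / 4 - (ALPHA + BETA) * HBAR * B / 2,
}

for key in exact:
    print(key, exact[key], pert[key])
```

The singlet-like level should agree to fourth order in $B$, while the $m = 0$ triplet level differs at second order because it was only computed to first order.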


10.2.1. More Ideas about Perturbation Methods 

The main problem with Rayleigh-Schrödinger perturbation theory (RSPT) is that the form of the higher order terms becomes increasingly complex and, hence, the series is difficult to evaluate.
The Brillouin-Wigner Method

This technique allows us to see the higher order structure of the perturbation series more clearly.

Consider the energy eigenvalue equation

$$H |N\rangle = E_n |N\rangle = (H_0 + gU) |N\rangle \tag{10.138}$$

Applying the linear functional $\langle m|$ we get

$$\langle m| H |N\rangle = E_n \langle m | N\rangle = \langle m| H_0 |N\rangle + g \langle m| U |N\rangle$$

$$E_n \langle m | N\rangle = \varepsilon_m \langle m | N\rangle + g \langle m| U |N\rangle$$

$$(E_n - \varepsilon_m) \langle m | N\rangle = g \langle m| U |N\rangle \tag{10.139}$$

We will use the normalization $\langle n | N\rangle = 1$ once again.

Now since the $|m\rangle$ states are a complete orthonormal basis we can always write

$$|N\rangle = \sum_m |m\rangle \langle m | N\rangle = |n\rangle \langle n | N\rangle + \sum_{m \neq n} |m\rangle \langle m | N\rangle = |n\rangle + \sum_{m \neq n} |m\rangle \langle m | N\rangle \tag{10.140}$$

Using the results above (10.139) we get

$$|N\rangle = |n\rangle + \sum_{m \neq n} |m\rangle \frac{g}{E_n - \varepsilon_m} \langle m| U |N\rangle \tag{10.141}$$

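We now develop a series expansion of $|N\rangle$ in powers of $g$ as follows:

0th-order:

$$|N\rangle = |n\rangle$$

1st-order: (substitute 0th-order result for $|N\rangle$ into general formula (10.141))

$$|N\rangle = |n\rangle + \sum_{m \neq n} |m\rangle \frac{g}{E_n - \varepsilon_m} \langle m| U |n\rangle$$

This is not the same result as in RSPT since the full energy $E_n$ remains in the denominator.

2nd-order: (substitute 1st-order result for $|N\rangle$ into general formula (10.141))

$$\begin{aligned}
|N\rangle &= |n\rangle + \sum_{m \neq n} |m\rangle \frac{g}{E_n - \varepsilon_m} \langle m| U \left( |n\rangle + \sum_{j \neq n} |j\rangle \frac{g}{E_n - \varepsilon_j} \langle j| U |n\rangle \right) \\
&= |n\rangle + \sum_{m \neq n} |m\rangle \frac{g}{E_n - \varepsilon_m} \langle m| U |n\rangle + g^2 \sum_{m \neq n} \sum_{j \neq n} |m\rangle \frac{\langle m| U |j\rangle \langle j| U |n\rangle}{(E_n - \varepsilon_m)(E_n - \varepsilon_j)}
\end{aligned}$$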
and so on.

This is a complicated power series in $g$ since the full energy $E_n$ remains in the denominator.

If we let $|m\rangle = |n\rangle$ in (10.139) we get

$$E_n \langle n | N\rangle = \varepsilon_n \langle n | N\rangle + g \langle n| U |N\rangle \tag{10.142}$$

$$E_n = \varepsilon_n + g \langle n| U |N\rangle \tag{10.143}$$

This can be expanded to give $E_n$ as a series in powers of $g$, i.e., substituting the 0th-order approximation for $|N\rangle$ gives the 1st-order approximation for $E_n$ and substituting the 1st-order approximation for $|N\rangle$ gives the 2nd-order approximation for $E_n$ and so on.

If we substitute the 1st-order approximation for $|N\rangle$ we get

$$\begin{aligned}
E_n &= \varepsilon_n + g \langle n| U |N\rangle \\
&= \varepsilon_n + g \langle n| U \left( |n\rangle + \sum_{m \neq n} |m\rangle \frac{g}{E_n - \varepsilon_m} \langle m| U |n\rangle \right) \\
&= \varepsilon_n + g \langle n| U |n\rangle + g^2 \sum_{m \neq n} \frac{|\langle n| U |m\rangle|^2}{E_n - \varepsilon_m}
\end{aligned} \tag{10.144}$$

which is the second-order energy.

So the BWPT and the RSPT agree at each order of perturbation theory, as 
they must. The structure of the equations, however, is very different. 


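A simple example shows the very different properties of the two methods. Consider the Hamiltonian given by $H = H_0 + V$ where

$$H_0 = \begin{pmatrix} \varepsilon_1 & 0 \\ 0 & \varepsilon_2 \end{pmatrix} \rightarrow \text{eigenvectors } |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \text{ and } |2\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{10.145}$$

and

$$V = \begin{pmatrix} 0 & \alpha \\ \alpha^* & 0 \end{pmatrix} \tag{10.146}$$

The exact energy eigenvalues are obtained by diagonalizing the $H$ matrix

$$H = \begin{pmatrix} \varepsilon_1 & \alpha \\ \alpha^* & \varepsilon_2 \end{pmatrix} \tag{10.147}$$

to get the characteristic equation

$$\det \begin{pmatrix} \varepsilon_1 - E & \alpha \\ \alpha^* & \varepsilon_2 - E \end{pmatrix} = 0 = E^2 - (\varepsilon_1 + \varepsilon_2) E + (\varepsilon_1 \varepsilon_2 - |\alpha|^2) \tag{10.148}$$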
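which has solutions

$$E_1 = \frac{1}{2}(\varepsilon_1 + \varepsilon_2) + \frac{1}{2}\sqrt{(\varepsilon_1 - \varepsilon_2)^2 + 4|\alpha|^2} \tag{10.149}$$

$$E_2 = \frac{1}{2}(\varepsilon_1 + \varepsilon_2) - \frac{1}{2}\sqrt{(\varepsilon_1 - \varepsilon_2)^2 + 4|\alpha|^2} \tag{10.150}$$

In the degenerate limit, $\varepsilon_1 = \varepsilon_2 = \varepsilon$, we have the exact solutions

$$E_1 = \varepsilon + |\alpha| \quad \text{and} \quad E_2 = \varepsilon - |\alpha| \tag{10.151}$$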

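Now, BWPT gives

$$E_n = \varepsilon_n + \langle n| V |N\rangle = \varepsilon_n + \langle n| V |n\rangle + \sum_{m \neq n} \frac{|\langle m| V |n\rangle|^2}{E_n - \varepsilon_m} \tag{10.152}$$

or

$$E_1 = \varepsilon_1 + \langle 1| V |1\rangle + \frac{|\langle 2| V |1\rangle|^2}{E_1 - \varepsilon_2} = \varepsilon_1 + \frac{|\alpha|^2}{E_1 - \varepsilon_2} \tag{10.153}$$

Rearranging we get

$$E_1^2 - (\varepsilon_1 + \varepsilon_2) E_1 + (\varepsilon_1 \varepsilon_2 - |\alpha|^2) = 0 \tag{10.154}$$

which is the same eigenvalue equation as the exact solution. In the degenerate limit, $\varepsilon_1 = \varepsilon_2 = \varepsilon$, we have

$$E_1 = \varepsilon \pm |\alpha| \tag{10.155}$$

or BWPT gives the exact answer for this simple system, even in the degenerate case.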
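The contrast between the two methods for this two-level system can be seen numerically. In the sketch below the BWPT condition (10.153) is solved by simple fixed-point iteration, which is one possible way of handling an equation with $E_1$ on both sides (an assumption of this illustration, not a method taken from the text); the parameter values and function names are also hypothetical. The iteration only makes sense for $\varepsilon_1 \neq \varepsilon_2$; in the exactly degenerate limit one would solve the quadratic (10.154) directly.

```python
import math

def exact_levels(e1, e2, alpha):
    """Roots of the characteristic equation (10.148), i.e. (10.149) and (10.150)."""
    mean = 0.5 * (e1 + e2)
    root = 0.5 * math.sqrt((e1 - e2) ** 2 + 4 * abs(alpha) ** 2)
    return mean + root, mean - root

def bwpt_level(e1, e2, alpha, iterations=60):
    """Solve E = e1 + |alpha|^2 / (E - e2), Eq. (10.153), by fixed-point
    iteration starting from the unperturbed energy e1 (requires e1 != e2)."""
    energy = e1
    for _ in range(iterations):
        energy = e1 + abs(alpha) ** 2 / (energy - e2)
    return energy

def rspt_level(e1, e2, alpha):
    """Second-order RSPT energy with <1|V|1> = 0 and the unperturbed denominator."""
    return e1 + abs(alpha) ** 2 / (e1 - e2)

# Well separated levels: BWPT reproduces the exact root, RSPT is close
e1, e2, alpha = 1.0, 0.0, 0.3
print(exact_levels(e1, e2, alpha)[0], bwpt_level(e1, e2, alpha), rspt_level(e1, e2, alpha))

# Nearly degenerate levels: the RSPT correction blows up, the exact shift stays near |alpha|
e1, e2 = 1.0, 0.999
print(exact_levels(e1, e2, alpha)[0], rspt_level(e1, e2, alpha))
```

The second printout illustrates why a small energy denominator makes the RSPT series useless even though the exact level shift remains of order $|\alpha|$.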


On the other hand, RSPT gives to second-order

$$E_1 = \varepsilon_1 + \langle 1| V |1\rangle + \frac{|\alpha|^2}{\varepsilon_1 - \varepsilon_2} \tag{10.156}$$

This is equivalent to the exact formula to second order only!

In the degenerate limit we get nonsense since the denominator vanishes. As we saw earlier, RSPT requires an entirely different procedure in the degenerate case.

Notice that RSPT is in trouble even if $\varepsilon_1 \approx \varepsilon_2$ which implies that

$$\frac{|\alpha|^2}{\varepsilon_1 - \varepsilon_2} \text{ is very large} \tag{10.157}$$

and thus, that the perturbation expansion makes no sense (the terms are supposed to get smaller!).

A clever trick for handling these almost degenerate cases using RSPT goes as 
follows. 
Almost Degenerate Perturbation Theory

Given $H = H_0 + V$, suppose, as in the last example, we have two states $|n\rangle$ and $|m\rangle$ of the unperturbed (zero order) Hamiltonian $H_0$ that have energies that are approximately equal.

This is a troublesome situation for RSPT because it is an expansion that includes increasing numbers of

$$\frac{1}{\varepsilon_n - \varepsilon_m} \tag{10.158}$$

terms. This implies that successive terms in the perturbation series might decrease slowly or not at all.

To develop a more rapidly converging perturbation expansion we rearrange the calculation as follows. We use the definition of the identity operator in terms of projection operators to write

$$V = I V I = \sum_{i,j} |i\rangle \langle i| V |j\rangle \langle j| \tag{10.159}$$

We then break up $V$ into two parts

$$V = V_1 + V_2 \tag{10.160}$$

where we separate out the $m$ and $n$ terms into $V_1$

$$V_1 = |m\rangle \langle m| V |m\rangle \langle m| + |m\rangle \langle m| V |n\rangle \langle n| + |n\rangle \langle n| V |m\rangle \langle m| + |n\rangle \langle n| V |n\rangle \langle n| \tag{10.161}$$

and $V_2$ = the rest of the terms. We then write

$$H = H_0 + V = H_0 + V_1 + V_2 = H_0' + V_2 \tag{10.162}$$

This new procedure then finds exact eigenvectors/eigenvalues of $H_0'$ and treats $V_2$ by ordinary perturbation theory.

Since the basis is orthonormal, we have from the definition of $V_2$, i.e.,

$$V_2 = \sum_{i,j \neq m,n} |i\rangle \langle i| V |j\rangle \langle j| \tag{10.163}$$

(all terms of (10.159) that are not already included in $V_1$), which gives

$$0 = \langle n| V_2 |n\rangle = \langle n| V_2 |m\rangle = \langle m| V_2 |n\rangle = \langle m| V_2 |m\rangle \tag{10.164}$$

Thus, the closeness of the levels $\varepsilon_n$ and $\varepsilon_m$ will not prevent us from applying standard perturbation theory to $V_2$, i.e., the numerators of terms with very small energy denominators, which might cause the series to diverge, all vanish identically!
Now if $|i\rangle$ is an eigenvector of $H_0$ (not $|m\rangle$ or $|n\rangle$), then it is also an eigenvector of $H_0'$ since, by the orthonormality condition,

$$V_1 |i\rangle = 0 \tag{10.165}$$

Neither $|m\rangle$ nor $|n\rangle$ is an eigenvector of $H_0'$ however.

Now, the $H_0$ matrix is diagonal since we are using its eigenvectors as a basis. The $H_0'$ matrix is diagonal also except for the 2x2 submatrix

$$\begin{pmatrix} \langle m| H_0' |m\rangle & \langle m| H_0' |n\rangle \\ \langle n| H_0' |m\rangle & \langle n| H_0' |n\rangle \end{pmatrix} \tag{10.166}$$

Therefore, we can finish the solution of the problem of $H_0'$ by diagonalizing this 2x2 matrix.

Diagonalizing the 2x2 matrix is equivalent to finding the linear combinations (or new zero order eigenvectors)

$$\alpha |n\rangle + \beta |m\rangle \tag{10.167}$$

that diagonalize the 2x2 matrix.

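We must have

$$H_0' \left( \alpha |n\rangle + \beta |m\rangle \right) = (H_0 + V_1) \left( \alpha |n\rangle + \beta |m\rangle \right) = E' \left( \alpha |n\rangle + \beta |m\rangle \right) \tag{10.168}$$

Now

$$\begin{aligned}
H_0' |n\rangle &= H_0 |n\rangle + V_1 |n\rangle \\
&= \varepsilon_n |n\rangle + |m\rangle \langle m| V |m\rangle \langle m | n\rangle + |m\rangle \langle m| V |n\rangle \langle n | n\rangle + |n\rangle \langle n| V |m\rangle \langle m | n\rangle + |n\rangle \langle n| V |n\rangle \langle n | n\rangle \\
&= \varepsilon_n |n\rangle + |m\rangle \langle m| V |n\rangle + |n\rangle \langle n| V |n\rangle \\
&= \left( \varepsilon_n + \langle n| V |n\rangle \right) |n\rangle + |m\rangle \langle m| V |n\rangle \\
&= E_n^{(1)} |n\rangle + \langle m| V |n\rangle |m\rangle
\end{aligned} \tag{10.169}$$

where $E_n^{(1)} = \varepsilon_n + \langle n| V |n\rangle$ denotes the energy of $|n\rangle$ through first order, and similarly

$$H_0' |m\rangle = E_m^{(1)} |m\rangle + \langle n| V |m\rangle |n\rangle \tag{10.170}$$

Therefore, we get

$$\alpha \left( E_n^{(1)} |n\rangle + \langle m| V |n\rangle |m\rangle \right) + \beta \left( E_m^{(1)} |m\rangle + \langle n| V |m\rangle |n\rangle \right) = E' \left( \alpha |n\rangle + \beta |m\rangle \right) \tag{10.171}$$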

Since the state vectors $|m\rangle$ and $|n\rangle$ are orthogonal, we must then have

$$E_n^{(1)} \alpha + \langle n| V |m\rangle \beta = E' \alpha \tag{10.172}$$

$$\langle m| V |n\rangle \alpha + E_m^{(1)} \beta = E' \beta \tag{10.173}$$

These equations have two solutions, namely,

$$\alpha = \langle n| V |m\rangle \tag{10.174}$$

$$\beta_\pm = \frac{E_m^{(1)} - E_n^{(1)}}{2} \pm \sqrt{\left( \frac{E_m^{(1)} - E_n^{(1)}}{2} \right)^2 + |\langle n| V |m\rangle|^2} \tag{10.175}$$

which then give the results

$$E_\pm = \frac{E_m^{(1)} + E_n^{(1)}}{2} \pm \sqrt{\left( \frac{E_m^{(1)} - E_n^{(1)}}{2} \right)^2 + |\langle n| V |m\rangle|^2} \tag{10.176}$$

We then know all the eigenvectors/eigenvalues of $H_0'$ (we know all of the unperturbed states) and we can deal with $V_2$ by perturbation theory.
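The closed-form solution for the 2x2 block can be checked by substituting it back into the two coupled equations (10.172) and (10.173). The sketch below does this for a hypothetical pair of nearly degenerate levels; a real matrix element $\langle n|V|m\rangle$ is assumed for simplicity and the function name is illustrative only.

```python
import math

def almost_degenerate_pair(en1, em1, v_nm):
    """Eigenvalues and (unnormalized) coefficients for the 2x2 block of H0'.

    en1, em1 : E_n^(1) = eps_n + <n|V|n> and E_m^(1) = eps_m + <m|V|m>
    v_nm     : <n|V|m>, assumed real in this sketch
    """
    half_diff = 0.5 * (em1 - en1)
    root = math.sqrt(half_diff ** 2 + v_nm ** 2)
    solutions = []
    for sign in (+1, -1):
        energy = 0.5 * (em1 + en1) + sign * root   # Eq. (10.176)
        alpha = v_nm                               # Eq. (10.174)
        beta = half_diff + sign * root             # Eq. (10.175)
        solutions.append((energy, alpha, beta))
    return solutions

en1, em1, v_nm = 1.000, 1.002, 0.05    # hypothetical nearly degenerate pair
solutions = almost_degenerate_pair(en1, em1, v_nm)
for energy, alpha, beta in solutions:
    # residuals of Eqs. (10.172) and (10.173); both should vanish
    r1 = en1 * alpha + v_nm * beta - energy * alpha
    r2 = v_nm * alpha + em1 * beta - energy * beta
    print(energy, alpha, beta, r1, r2)
```

Even though the unperturbed separation here is much smaller than the coupling, the two energies come out finite and split by roughly $2|\langle n|V|m\rangle|$, which is the point of treating $V_1$ exactly.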

Finally, let us introduce another interesting idea. 


Fake Degenerate Perturbation Theory 

Consider the problem of finding the energy eigenvalues and state vectors for a system with a Hamiltonian $H = H_0 + V$ where we know the solution to the zero-order system

$$H_0 |n\rangle = \varepsilon_n |n\rangle \tag{10.177}$$

We will assume that the unperturbed states are nondegenerate.

We define

$$E_{average} = E_{av} = \frac{1}{N} \sum_{n=1}^{N} \varepsilon_n \tag{10.178}$$

and redefine

$$H = E_{av} I + U \tag{10.179}$$

where

$$U = H_0 - E_{av} I + V \tag{10.180}$$

If the energies associated with $U$ are small corrections to $E_{av}$, then we can use degenerate perturbation theory to solve this problem, i.e., the new unperturbed Hamiltonian is

$$H_0' = E_{av} I \tag{10.181}$$

and all of its levels are degenerate in zero order.

The problem is then solved by diagonalizing the $U$ matrix in the basis of $H_0$ states.
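As a minimal sketch of this idea, consider only two closely spaced levels with a real symmetric perturbation (both restrictions, the numbers, and the function names are assumptions made for the illustration). Diagonalizing $U$ and adding back $E_{av}$ then gives the same levels as diagonalizing $H_0 + V$ directly, because in this tiny example the "degenerate subspace" is the whole space.

```python
import math

def fake_degenerate_levels(e1, e2, v11, v12, v22):
    """Two-level version of Eqs. (10.178)-(10.181) for a real symmetric V."""
    e_av = 0.5 * (e1 + e2)                  # Eq. (10.178) with N = 2
    # U = H0 - E_av*I + V in the basis of H0 eigenstates, Eq. (10.180)
    u11 = e1 - e_av + v11
    u22 = e2 - e_av + v22
    u12 = v12
    # closed-form eigenvalues of the 2x2 matrix U
    mean = 0.5 * (u11 + u22)
    root = math.sqrt(0.25 * (u11 - u22) ** 2 + u12 ** 2)
    return e_av + mean + root, e_av + mean - root

def direct_levels(e1, e2, v11, v12, v22):
    """Eigenvalues of H0 + V computed directly, for comparison."""
    a, d = e1 + v11, e2 + v22
    mean = 0.5 * (a + d)
    root = math.sqrt(0.25 * (a - d) ** 2 + v12 ** 2)
    return mean + root, mean - root

args = (1.00, 1.10, 0.02, 0.05, -0.01)      # hypothetical closely spaced levels
print(fake_degenerate_levels(*args))
print(direct_levels(*args))
```

In a larger system one would diagonalize $U$ only within the group of closely spaced levels and treat the coupling to the remaining states perturbatively.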
10.2.2. Thoughts on Degeneracy and Position Representation

When we derived the energy spectrum of the hydrogen atom we found that the states were labeled by three quantum numbers

$$|\psi\rangle = |n\ell m\rangle \tag{10.182}$$

where

n = the principal quantum number

$\ell$ = orbital angular momentum quantum number

m = z-component of orbital angular momentum quantum number

and we found that

$n = 1, 2, 3, \ldots$

$\ell = 0, 1, 2, \ldots, n-1$ for a given value of $n$

$m = -\ell, -\ell+1, \ldots, \ell-1, \ell$ for a given value of $\ell$


The energy eigenvalues, however, did not depend on $\ell$ or $m$. We found that

$$E_{n\ell m} = E_n = -\frac{e^2}{2a_0 n^2} \tag{10.183}$$

Therefore, each energy level had a degeneracy given by

$$g = \sum_{\ell=0}^{n-1} \sum_{m=-\ell}^{\ell} 1 = \sum_{\ell=0}^{n-1} (2\ell + 1) = 2\sum_{\ell=0}^{n-1} \ell + \sum_{\ell=0}^{n-1} 1 = 2\,\frac{n(n-1)}{2} + n = n^2 \tag{10.184}$$
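The counting in (10.184) is easy to confirm by brute force. The short sketch below enumerates the allowed $(\ell, m)$ pairs for the first few values of $n$; the function name is hypothetical.

```python
def hydrogen_states(n):
    """All (l, m) pairs allowed for a given n: l = 0..n-1, m = -l..l."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

for n in range(1, 6):
    count = len(hydrogen_states(n))
    print(n, count, n ** 2)     # the last two columns should be equal
```

Spin is not included in this count; including the two spin states of the electron would double each number.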


The degeneracy with respect to $m$ is understandable since no direction is explicitly preferred in the Hamiltonian. We expect that this degeneracy will disappear as soon as a preferred direction is added to the Hamiltonian, as in the case of external electric (Stark effect) or magnetic (Zeeman effect) fields.

The degeneracy with respect to $\ell$ is a property peculiar to the pure $1/r$ Coulomb potential. Since no other atom except hydrogen has a pure Coulomb potential, we expect this degeneracy to vanish in other atoms.

Such a degeneracy is called an accidental degeneracy.

Now the electron and proton making up the hydrogen atom also have spin angular momentum. The presence of these extra (internal) degrees of freedom should change the Hamiltonian.
The Schrödinger equation was derived from the eigenvalue equation for the Hamiltonian

$$H |\psi\rangle = E |\psi\rangle \tag{10.185}$$

by re-expressing that equation in the position representation. The associated Schrödinger wave functions were given by the scalar product (linear functional) relation

$$\psi(\vec{r}) = \langle \vec{r} | \psi\rangle \tag{10.186}$$

The single particle Schrödinger equation is relevant for problems where the Hamiltonian contains terms dependent on ordinary 3-dimensional space (for many-particle systems we must use a multi-dimensional configuration space which bears no simple relationship to ordinary three-dimensional space). Spin is an internal degree of freedom that has no representation in the 3-dimensional space of the Schrödinger wave equation.

The Schrödinger picture, however, does not choose a particular representation and, therefore, we can include spin within the context of solving the Schrödinger equation in the following ad hoc manner. A more rigorous treatment requires relativity.

If there are spin-dependent terms in the Hamiltonian, then we expand the Hilbert space used to solve the problem by constructing a new basis that is made up of direct product states of the following type

$$|\psi_{new}\rangle = |\psi\rangle \otimes |s, m_s\rangle \tag{10.187}$$

where $|\psi\rangle$ depends on only ordinary 3-dimensional space and $|s, m_s\rangle$ is an eigenvector of $\vec{S}_{op}^{\,2}$ and $S_z$.

The energy eigenvalue equation becomes

$$H |\psi_{new}\rangle = E |\psi_{new}\rangle = \left( (\text{3-space operators}) |\psi\rangle \right) \otimes \left( (\text{spin-dependent operators}) |s, m_s\rangle \right)$$

and the corresponding wave function is

$$\langle \vec{r} | \psi_{new}\rangle = \langle \vec{r} | \psi\rangle |s, m_s\rangle = \psi(\vec{r}) |s, m_s\rangle \tag{10.188}$$

where the abstract spin vector is stuck onto the wave function in some way (maybe with superglue).

Let us now investigate what happens in atoms when we add in spin, some aspects 
of relativity and external fields. We restrict our attention to one-electron atoms 
like hydrogen at this point. 
10.3. Spin-Orbit Interaction - Fine Structure

The proton in hydrogen generates an electric field

$$\vec{\mathcal{E}} = \frac{e}{r^3} \vec{r} \tag{10.189}$$

that acts on the moving electron. This result is approximately true (to first order) in most atoms. Now special relativity says that an electron moving with a velocity $\vec{v}$ through an electric field $\vec{\mathcal{E}}$ also behaves as if it is interacting with a magnetic field given by

$$\vec{B} = -\frac{1}{c} \vec{v} \times \vec{\mathcal{E}} \tag{10.190}$$

to first order in $v/c$.


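This magnetic field interacts with the spin (actually with its associated magnetic moment) to produce an additional contribution to the energy of the form

$$E = -\vec{M}_{spin} \cdot \vec{B} \tag{10.191}$$

where

$$\vec{M}_{spin} = -\frac{e}{mc} \vec{S} \tag{10.192}$$

Substituting everything in we get

$$E = -\frac{e}{mc^2} \vec{S} \cdot (\vec{v} \times \vec{\mathcal{E}}) = -\frac{e}{mc^2} \vec{S} \cdot \left( \vec{v} \times \frac{e}{r^3} \vec{r} \right) \tag{10.193}$$

Now

$$\frac{e^2}{r^3} = \frac{1}{r} \frac{dV}{dr} \quad \text{for} \quad V(r) = -\frac{e^2}{r} = \text{potential energy of the electron} \tag{10.194}$$

so that, using $\vec{v} \times \vec{r} = -\vec{L}/m$, we finally obtain the so-called spin-orbit energy contribution

$$E = \left[ \frac{1}{m^2 c^2} \frac{1}{r} \frac{dV}{dr} \right] \vec{S} \cdot \vec{L} = E_{spin-orbit} = E_{so} \tag{10.195}$$

This corresponds to an additional term in the Hamiltonian of the form

$$H_{so} = \left[ \frac{1}{m^2 c^2} \frac{1}{r} \frac{dV}{dr} \right] \vec{S}_{op} \cdot \vec{L}_{op} \tag{10.196}$$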


This term couples the orbital and spin angular momentum degrees of freedom 
(hence the label spin-orbit energy) and mixes 3-dimensional space with spin 
space. That is why we had to expand the Hilbert space as we discussed earlier. 


Another way to think about this interaction is that the electron spin magnetic moment vector (or spin vector) is precessing about the direction of the magnetic field. The equations for such a precessional motion are

$$\vec{M}_{spin} \times \vec{B} = \frac{d\vec{S}}{dt} = \vec{\Omega}_{Larmor} \times \vec{S} \tag{10.197}$$

where

$$\vec{\Omega}_{Larmor} = \frac{e}{mc} \vec{B} \tag{10.198}$$

Now

$$\vec{B} = -\frac{1}{c} \vec{v} \times \vec{\mathcal{E}} = \frac{1}{emc} \frac{1}{r} \frac{dV}{dr} \vec{L} \tag{10.199}$$

which implies that

$$\vec{\Omega}_{Larmor} = \frac{1}{m^2 c^2} \frac{1}{r} \frac{dV}{dr} \vec{L} \tag{10.200}$$

It turns out that this is exactly a factor of 2 too large. There is another relativistic effect, which gives another precession (called Thomas precession) effect, that cancels exactly one-half of this spin-orbit effect.


10.3.1. Thomas Precession 

This is a relativistic kinematic effect. It results from the time dilation between the rest frames of the electron and the proton. This causes observers in these two frames to disagree on the time required for one of the particles to make a complete revolution about the other particle.

If an observer on the electron measures a time interval $T$, then the observer on the proton measures

$$T' = \gamma T \quad \text{where} \quad \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} , \quad v = \text{speed of the electron} \tag{10.201}$$

We assume uniform circular motion for simplicity.

The orbital angular velocities measured by the observers are

$$\frac{2\pi}{T} \quad \text{and} \quad \frac{2\pi}{T'} \tag{10.202}$$


respectively. 

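In the rest frame of the electron, the spin angular momentum vector maintains its direction in space. This implies that an observer on the proton sees this spin vector precessing at a rate equal to the difference of the two angular velocities, i.e., the precessional frequency is

$$\Omega_{Thomas} = \frac{2\pi}{T'} - \frac{2\pi}{T} = \frac{2\pi}{T'} (1 - \gamma) \approx -\frac{2\pi}{T'} \frac{v^2}{2c^2} \tag{10.203}$$

But we also have

$$\frac{2\pi}{T'} = \omega = \frac{|\vec{L}|}{mr^2} \quad \text{and} \quad \frac{mv^2}{r} = \frac{dV}{dr} \tag{10.204}$$

for circular motion.

Thus, we get

$$\vec{\Omega}_{Thomas} = -\frac{1}{2} \frac{1}{m^2 c^2} \frac{1}{r} \frac{dV}{dr} \vec{L} = -\frac{1}{2} \vec{\Omega}_{Larmor} \tag{10.205}$$

Therefore, the combined precession is reduced by a factor of two and we get the result

$$H_{so} = \left[ \frac{1}{2m^2 c^2} \frac{1}{r} \frac{dV}{dr} \right] \vec{S}_{op} \cdot \vec{L}_{op} \tag{10.206}$$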


The energy levels arising from this correction are called the atomic fine structure. 


10.4. Another Relativity Correction 

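The correct relativistic kinetic energy term is

$$K = \sqrt{p^2 c^2 + m^2 c^4} - mc^2 \approx \frac{p^2}{2m} - \frac{p^4}{8m^3 c^2}$$

so the $p^2/2m$ term already included in the Hamiltonian must be corrected by a term proportional to $p^4$. In the absence of external fields the Hamiltonian is then

$$H = H_0 + H_{relativity} + H_{so}$$

where

$$H_0 = \frac{\vec{p}_{op}^{\,2}}{2m} + V(r) \tag{10.210}$$

$$H_{relativity} = -\frac{\vec{p}_{op}^{\,4}}{8m^3 c^2} \tag{10.211}$$

$$H_{so} = \left[ \frac{1}{2m^2 c^2} \frac{1}{r} \frac{dV}{dr} \right] \vec{S}_{op} \cdot \vec{L}_{op} \tag{10.212}$$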


10.5. External Fields - Zeeman and Stark Effects; 
Hyperfine Structure 


10.5.1. Zeeman Effect 

If an external magnetic field exists, then it interacts with the total magnetic
moment of the electron, where

$$\vec{M}_{total} = \vec{M}_{orbital} + \vec{M}_{spin} = -\frac{e}{2mc}\left(g_\ell \vec{L} + g_s \vec{S}\right) \tag{10.213}$$

as we derived earlier. If we define

$$\mu_B = \text{Bohr magneton} = \frac{e\hbar}{2mc} \tag{10.214}$$

and let $\vec{B}_{ext} = B\hat{z}$, then we have, using $g_\ell = 1$ and $g_s = 2$, the result

$$E_{Zeeman} = -\vec{M}_{total}\cdot\vec{B}_{ext} = \frac{\mu_B B}{\hbar}\left(L_z + 2S_z\right) \tag{10.215}$$

Thus, we must add a term of the form

$$H_{Zeeman} = \frac{\mu_B B}{\hbar}\left(L_z + 2S_z\right) \tag{10.216}$$

to the Hamiltonian when an external magnetic field is present.


We can see directly how the orbital angular momentum part of this energy
arises. We saw earlier that if we had a Hamiltonian

$$H_0 = \frac{\vec{p}_{op}^{\,2}}{2m} + V(\vec{r}_{op}) \tag{10.217}$$

and we add an electromagnetic field characterized by a vector potential $\vec{A}$, where
$\vec{B} = \nabla \times \vec{A}$, then the momentum operator for a particle of charge $q$ changes to

$$\vec{p}_{em} = \vec{p} - \frac{q}{c}\vec{A}(\vec{r}) \tag{10.218}$$

and the Hamiltonian changes to

$$H = \frac{\vec{p}_{em,op}^{\,2}}{2m} + V(\vec{r}_{op}) = \frac{\left(\vec{p}_{op} - \frac{q}{c}\vec{A}(\vec{r}_{op})\right)^2}{2m} + V(\vec{r}_{op})
= H_0 - \frac{q}{2mc}\left(\vec{p}_{op}\cdot\vec{A}(\vec{r}_{op}) + \vec{A}(\vec{r}_{op})\cdot\vec{p}_{op}\right) + \frac{q^2}{2mc^2}\vec{A}^2(\vec{r}_{op}) \tag{10.219}$$

The magnetic field has to be enormous or the radial quantum number n very
large for the $A^2$ term to have any effect, so we will neglect it for now. Let us
look at the term

$$\vec{p}_{op}\cdot\vec{A}(\vec{r}_{op}) + \vec{A}(\vec{r}_{op})\cdot\vec{p}_{op} \tag{10.220}$$

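For a uniform (constant in magnitude and direction) external field $\vec{B}$, we have

$$\vec{A} = -\frac{1}{2}\,\vec{r}\times\vec{B} \tag{10.221}$$

I will prove this so we get a chance to see the use of $\varepsilon_{ijk}$ in vector algebra.

$$\begin{aligned}
\nabla\times\vec{A} &= -\frac{1}{2}\nabla\times(\vec{r}\times\vec{B}) = -\frac{1}{2}\sum_{ijk}\varepsilon_{ijk}\frac{\partial}{\partial x_j}(\vec{r}\times\vec{B})_k\,\hat{e}_i \\
&= -\frac{1}{2}\sum_{ijk}\varepsilon_{ijk}\frac{\partial}{\partial x_j}\Big(\sum_{mn}\varepsilon_{kmn}x_m B_n\Big)\hat{e}_i \\
&= -\frac{1}{2}\sum_{ij}\sum_{mn}\Big(\sum_k \varepsilon_{ijk}\varepsilon_{mnk}\Big)\frac{\partial}{\partial x_j}(x_m B_n)\,\hat{e}_i \\
&= -\frac{1}{2}\sum_{ij}\sum_{mn}\left(\delta_{im}\delta_{jn} - \delta_{in}\delta_{jm}\right)\frac{\partial}{\partial x_j}(x_m B_n)\,\hat{e}_i \\
&= -\frac{1}{2}\sum_{ij}\left[\frac{\partial}{\partial x_j}(x_i B_j) - \frac{\partial}{\partial x_j}(x_j B_i)\right]\hat{e}_i \\
&= -\frac{1}{2}\sum_{ij}\left[\frac{\partial x_i}{\partial x_j}B_j + x_i\frac{\partial B_j}{\partial x_j} - \frac{\partial x_j}{\partial x_j}B_i - x_j\frac{\partial B_i}{\partial x_j}\right]\hat{e}_i
\end{aligned}$$

Now

$$\frac{\partial x_i}{\partial x_j} = \delta_{ij} \quad \text{and} \quad \frac{\partial B_i}{\partial x_j} = 0 \tag{10.222}$$

so we get

$$\nabla\times\vec{A} = -\frac{1}{2}\left[\sum_i B_i\hat{e}_i - 3\sum_i B_i\hat{e}_i\right] = -\frac{1}{2}\left[\vec{B} - 3\vec{B}\right] = \vec{B}$$

Therefore, we have

$$\begin{aligned}
\vec{p}_{op}\cdot\vec{A}(\vec{r}_{op}) + \vec{A}(\vec{r}_{op})\cdot\vec{p}_{op} &= -\frac{1}{2}\left[\vec{p}_{op}\cdot(\vec{r}_{op}\times\vec{B}) + (\vec{r}_{op}\times\vec{B})\cdot\vec{p}_{op}\right] \\
&= -\frac{1}{2}\sum_{ijk}\varepsilon_{ijk}B_k\left[p_i x_j + x_j p_i\right] = -\frac{1}{2}\sum_{ijk}\varepsilon_{ijk}B_k\left[2x_j p_i - i\hbar\delta_{ij}\right] \\
&= \sum_{ijk}\varepsilon_{kji}\,x_j p_i B_k = (\vec{r}_{op}\times\vec{p}_{op})\cdot\vec{B} = \vec{L}_{op}\cdot\vec{B}
\end{aligned} \tag{10.223}$$

which then gives, for the electron (charge $q = -e$),

$$H = H_0 + \frac{e}{2mc}\,\vec{L}_{op}\cdot\vec{B} + \frac{e^2}{2mc^2}\vec{A}^2(\vec{r}_{op}) \tag{10.224}$$

which accounts for the orbital angular momentum part of the Zeeman energy.

The spin angular momentum part of the Zeeman energy cannot be derived
from the non-relativistic Schrödinger equation. When one derives the Dirac
relativistic equation for the electron, the $\vec{S}_{op}\cdot\vec{B}$ term appears naturally.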
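
The identity behind the orbital Zeeman term, namely that the vector potential $\vec{A} = -\frac{1}{2}\vec{r}\times\vec{B}$ has curl $\vec{B}$ for a uniform field, can be spot-checked numerically. The sketch below is only an illustration: the helper names (`cross`, `vector_potential`, `curl`), the sample field, and the sample point are my own choices, not part of the derivation.

```python
# Finite-difference check that A = -(1/2) r x B has curl B for uniform B.
# All names and sample values here are illustrative assumptions.

def cross(u, v):
    """Cartesian cross product u x v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vector_potential(r, B):
    """A(r) = -(1/2) r x B, the gauge choice of Eq. (10.221)."""
    return tuple(-0.5 * c for c in cross(r, B))

def curl(field, r, h=1e-3):
    """Central-difference curl of a vector field at the point r."""
    def d(i, j):
        # partial derivative of component i with respect to x_j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (field(rp)[i] - field(rm)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

B = (0.3, -1.2, 2.0)          # arbitrary uniform field
point = (0.7, -0.4, 1.5)      # arbitrary evaluation point
curl_A = curl(lambda r: vector_potential(r, B), point)
print(curl_A)                 # should agree with B up to rounding
```

Because $\vec{A}$ is linear in $\vec{r}$, the central difference is exact apart from floating-point rounding, so the agreement does not depend on the step size.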



10.5.2. Stark Effect 


If a hydrogen atom is placed in an external electric field $\vec{\mathcal{E}}$ which is constant
in space and time (uniform and static), then an additional energy appears. It
corresponds to an interaction between the external electric field and an electric
dipole made up of the electron and proton separated by a distance r. We introduce
the electric dipole moment operator

$$\vec{d}_{op} = -e\,\vec{r}_{op} \tag{10.225}$$

where $\vec{r}$ is the position vector of the electron relative to the proton. We then
write the extra energy term as

$$H_{dipole} = -\vec{d}_{op}\cdot\vec{\mathcal{E}} \tag{10.226}$$

If we choose $\vec{\mathcal{E}} = \mathcal{E}\hat{z}$, then we have $H_{dipole} = ez\mathcal{E}$. The full Hamiltonian is then

$$H = H_0 + H_{relativity} + H_{so} + H_{Zeeman} + H_{dipole} \tag{10.227}$$

where

$$H_0 = \frac{\vec{p}_{op}^{\,2}}{2m} + V(\vec{r}_{op}) \tag{10.228}$$

$$H_{relativity} = -\frac{\vec{p}_{op}^{\,4}}{8m^3c^2} \tag{10.229}$$

$$H_{so} = \left[\frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dV}{dr}\right]\vec{S}_{op}\cdot\vec{L}_{op} \tag{10.230}$$

$$H_{Zeeman} = \frac{\mu_B}{\hbar}\left(\vec{L}_{op} + 2\vec{S}_{op}\right)\cdot\vec{B} \tag{10.231}$$

$$H_{dipole} = e\,\vec{r}_{op}\cdot\vec{\mathcal{E}} \tag{10.232}$$





10.5.3. Hyperfine Structure 

The nuclear magnetic dipole moment also generates a magnetic field. If we
assume that it is a point dipole $\vec{M}_N$, then the magnetic field is given by

$$\vec{B}(\vec{r}) = \left[\frac{3(\vec{M}_N\cdot\vec{r})\,\vec{r}}{r^5} - \frac{\vec{M}_N}{r^3}\right] + \frac{8\pi}{3}\vec{M}_N\,\delta(\vec{r}) \tag{10.233}$$

where the first two terms are the standard result of the magnetic field due to a
loop of current as seen from very far away (approximates dipole as a point) and
the last term is peculiar to a point dipole. The last term will give a contribution
only for spherically symmetric states ($\ell = 0$). The extra energy is then

$$H_{hyperfine} = -\vec{M}_e\cdot\vec{B} = \left[\frac{\vec{M}_N\cdot\vec{M}_e}{r^3} - \frac{3(\vec{M}_N\cdot\vec{r})(\vec{M}_e\cdot\vec{r})}{r^5}\right] - \frac{8\pi}{3}\vec{M}_N\cdot\vec{M}_e\,\delta(\vec{r}) \tag{10.234}$$

where

$$\vec{M}_N = g_N\frac{Ze}{2m_Nc}\vec{S}_{N,op} \quad \text{and} \quad \vec{M}_e = -\frac{e}{mc}\vec{S}_{e,op} \tag{10.235}$$

This is clearly due to spin-spin interactions between the electron and the nucleus
and gives rise to the so-called hyperfine level splitting.


10.6. Examples 

Now that we have identified all of the relevant corrections to the Hamiltonian 
for atoms, let us illustrate the procedures for calculation of the new energy levels 
via perturbation theory. We look at the simplest atom first. 


10.6.1. Spin-Orbit, Relativity, Zeeman Effect in Hydrogen 
Atom 

The Hamiltonian is

$$H = H_0 + H_{relativity} + H_{so} + H_{Zeeman} \tag{10.236}$$

where

$$H_0 = \frac{\vec{p}_{op}^{\,2}}{2m} - e^2\left(\frac{1}{r}\right)_{op} \tag{10.237}$$

$$H_{relativity} = -\frac{\vec{p}_{op}^{\,4}}{8m^3c^2} \tag{10.238}$$

$$H_{so} = \left[\frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dV}{dr}\right]\vec{S}_{op}\cdot\vec{L}_{op} \tag{10.239}$$

$$H_{Zeeman} = \frac{\mu_B}{\hbar}\left(\vec{L}_{op} + 2\vec{S}_{op}\right)\cdot\vec{B} \tag{10.240}$$

The first step is to calculate all relevant commutators so that we can find those
operators that have a common eigenbasis.

$$[\vec{J}_{op}^{\,2}, H_0] = 0 = [J_z, H_0] = [\vec{L}_{op}^{\,2}, H_0] = [\vec{S}_{op}^{\,2}, H_0] \tag{10.241}$$

$$[\vec{L}_{op}^{\,2}, \vec{S}_{op}^{\,2}] = 0 = [\vec{L}_{op}^{\,2}, \vec{J}_{op}^{\,2}] = [\vec{S}_{op}^{\,2}, \vec{J}_{op}^{\,2}] \tag{10.242}$$

$$[\vec{J}_{op}^{\,2}, J_z] = 0 = [\vec{L}_{op}^{\,2}, J_z] = [\vec{S}_{op}^{\,2}, J_z] \tag{10.243}$$

This says that there exists a common set of eigenvectors in the unperturbed
system for the set of operators

$$H_0\,,\ \vec{J}_{op}^{\,2}\,,\ \vec{L}_{op}^{\,2}\,,\ \vec{S}_{op}^{\,2}\,,\ J_z \tag{10.244}$$

We label these states by the corresponding eigenvalues of the commuting set of
observables (these are called good quantum numbers)

$$|n,\ell,s,j,m_j\rangle \tag{10.245}$$

We also have

$$[S_z, H_0] = 0 = [L_z, H_0] = [\vec{L}_{op}^{\,2}, H_0] = [\vec{S}_{op}^{\,2}, H_0] \tag{10.246}$$

$$[\vec{L}_{op}^{\,2}, \vec{S}_{op}^{\,2}] = 0 = [\vec{L}_{op}^{\,2}, S_z] = [\vec{S}_{op}^{\,2}, S_z] = [\vec{L}_{op}^{\,2}, L_z] = [\vec{S}_{op}^{\,2}, L_z] \tag{10.247}$$

which says that there exists another common set of eigenvectors in the
unperturbed system for the operators

$$H_0\,,\ \vec{L}_{op}^{\,2}\,,\ \vec{S}_{op}^{\,2}\,,\ L_z\,,\ S_z \tag{10.248}$$

We label these states by the corresponding eigenvalues of this commuting set of
observables (again these are called good quantum numbers)

$$|n,\ell,s,m_\ell,m_s\rangle \tag{10.249}$$

In this latter basis, the unperturbed or zero-order Hamiltonian has solutions
represented by

$$H_0\,|n,\ell,m_\ell,s,m_s\rangle = E_n^{(0)}\,|n,\ell,m_\ell,s,m_s\rangle \ , \quad E_n^{(0)} = -\frac{Z^2e^2}{2a_0n^2}$$

$$Ze = \text{nucleus charge } (Z = 1 \text{ for hydrogen}) \ , \quad a_0 = \text{Bohr radius} = \frac{\hbar^2}{me^2}$$

$$\psi_{n\ell m_\ell s m_s}(r,\theta,\phi) = \langle\vec{r}\,|\left(|n\ell m_\ell\rangle\,|s m_s\rangle\right) = \langle\vec{r}\,|\,n\ell m_\ell\rangle\,|s m_s\rangle = \psi_{n\ell m_\ell}(r,\theta,\phi)\,|s m_s\rangle$$

and the first few unperturbed wave functions are

$$\psi_{100}(r,\theta,\phi) = \frac{1}{\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2} e^{-Zr/a_0}$$

$$\psi_{200}(r,\theta,\phi) = \frac{1}{\sqrt{32\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\left(2 - \frac{Zr}{a_0}\right)e^{-Zr/2a_0}$$

$$\psi_{210}(r,\theta,\phi) = \frac{1}{\sqrt{32\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Zr}{a_0}\,e^{-Zr/2a_0}\cos\theta$$

$$\psi_{21\pm1}(r,\theta,\phi) = \mp\frac{1}{\sqrt{64\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Zr}{a_0}\,e^{-Zr/2a_0}\sin\theta\,e^{\pm i\phi}$$

We also have the relations below for the unperturbed states.

$$\vec{L}_{op}^{\,2}\,|n,\ell,s,j,m_j\rangle = \hbar^2\ell(\ell+1)\,|n,\ell,s,j,m_j\rangle$$
$$\vec{S}_{op}^{\,2}\,|n,\ell,s,j,m_j\rangle = \hbar^2s(s+1)\,|n,\ell,s,j,m_j\rangle$$
$$\vec{J}_{op}^{\,2}\,|n,\ell,s,j,m_j\rangle = \hbar^2j(j+1)\,|n,\ell,s,j,m_j\rangle$$
$$J_z\,|n,\ell,s,j,m_j\rangle = \hbar m_j\,|n,\ell,s,j,m_j\rangle$$
$$\vec{L}_{op}^{\,2}\,|n,\ell,s,m_\ell,m_s\rangle = \hbar^2\ell(\ell+1)\,|n,\ell,s,m_\ell,m_s\rangle$$
$$\vec{S}_{op}^{\,2}\,|n,\ell,s,m_\ell,m_s\rangle = \hbar^2s(s+1)\,|n,\ell,s,m_\ell,m_s\rangle$$
$$L_z\,|n,\ell,s,m_\ell,m_s\rangle = \hbar m_\ell\,|n,\ell,s,m_\ell,m_s\rangle$$
$$S_z\,|n,\ell,s,m_\ell,m_s\rangle = \hbar m_s\,|n,\ell,s,m_\ell,m_s\rangle$$
$$J_z\,|n,\ell,s,m_\ell,m_s\rangle = (L_z + S_z)\,|n,\ell,s,m_\ell,m_s\rangle = \hbar(m_\ell + m_s)\,|n,\ell,s,m_\ell,m_s\rangle = \hbar m_j\,|n,\ell,s,m_\ell,m_s\rangle$$

Since the total angular momentum is given by

$$\vec{J}_{op} = \vec{L}_{op} + \vec{S}_{op} \tag{10.265}$$

the rules we developed for the addition of angular momentum say that

$$j = \ell+s,\ \ell+s-1,\ \ldots,\ |\ell-s|+1,\ |\ell-s| \tag{10.266}$$

and

$$m_j = j,\ j-1,\ j-2,\ \ldots,\ -j+1,\ -j \tag{10.267}$$

In the case of hydrogen, where s = 1/2, we have only two allowed total j values
for each $\ell \geq 1$ (and only j = 1/2 for $\ell = 0$), namely,

$$j = \ell \pm \tfrac{1}{2} \tag{10.268}$$

We can use either of the two sets of basis states (both are an orthonormal basis)

$$|n,\ell,s,j,m_j\rangle \quad \text{or} \quad |n,\ell,s,m_\ell,m_s\rangle \tag{10.269}$$

as the zero-order states for a perturbation theory development of the energies.
The choice depends on the specific perturbations we are trying to calculate.
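
As a quick consistency check on the two labeling schemes, the sketch below enumerates both sets of basis states for a given n and confirms that each contains $2n^2$ states. The function names `coupled_states` and `product_states` are hypothetical helpers introduced only for this illustration; exact fractions are used so half-integer quantum numbers are not subject to rounding.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def coupled_states(n, s=HALF):
    """All (n, l, s, j, m_j) labels allowed by Eqs. (10.266)-(10.267)."""
    states = []
    for l in range(n):                 # l = 0, 1, ..., n-1
        j = abs(l - s)
        while j <= l + s:              # j runs from |l-s| to l+s in unit steps
            m = -j
            while m <= j:              # m_j runs from -j to +j in unit steps
                states.append((n, l, s, j, m))
                m += 1
            j += 1
    return states

def product_states(n, s=HALF):
    """All (n, l, s, m_l, m_s) labels of the direct-product basis."""
    states = []
    for l in range(n):
        for ml in range(-l, l + 1):
            ms = -s
            while ms <= s:
                states.append((n, l, s, ml, ms))
                ms += 1
    return states

for n in (1, 2, 3):
    print(n, len(coupled_states(n)), len(product_states(n)), 2 * n * n)
```

Both counts agree with $2n^2$, which is the statement that the two sets are alternative bases for the same degenerate subspace.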


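Let us start off by using the $|n,\ell,s,j,m_j\rangle$ states.
If we use the potential energy function

$$V(r) = -\frac{e^2}{r} \tag{10.270}$$

for hydrogen, then the spin-orbit correction to the Hamiltonian becomes

$$H_{so} = \left[\frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dV}{dr}\right]\vec{S}_{op}\cdot\vec{L}_{op} = \frac{e^2}{2m^2c^2}\left(\frac{1}{r^3}\right)_{op}\vec{S}_{op}\cdot\vec{L}_{op} \tag{10.271}$$

Now $\vec{J}_{op} = \vec{L}_{op} + \vec{S}_{op}$ implies that

$$\vec{J}_{op}^{\,2} = \vec{L}_{op}^{\,2} + \vec{S}_{op}^{\,2} + 2\vec{L}_{op}\cdot\vec{S}_{op} \tag{10.272}$$

$$\rightarrow \vec{L}_{op}\cdot\vec{S}_{op} = \frac{1}{2}\left(\vec{J}_{op}^{\,2} - \vec{L}_{op}^{\,2} - \vec{S}_{op}^{\,2}\right) \tag{10.273}$$

and therefore

$$H_{so} = \frac{e^2}{4m^2c^2}\left(\frac{1}{r^3}\right)_{op}\left(\vec{J}_{op}^{\,2} - \vec{L}_{op}^{\,2} - \vec{S}_{op}^{\,2}\right) \tag{10.274}$$

Therefore,

$$[\vec{J}_{op}^{\,2}, H_{so}] = 0 = [J_z, H_{so}] = [\vec{L}_{op}^{\,2}, H_{so}] = [\vec{S}_{op}^{\,2}, H_{so}] \tag{10.275}$$

which implies that $H_{so}$ cannot connect states with different values of $\ell$, s, j or $m_j$.
Within each degenerate level of fixed n, the matrix representation of $H_{so}$ in the
$|n,\ell,s,j,m_j\rangle$ basis is therefore diagonal, and we can apply the standard
non-degenerate first-order formula. (The factor $(1/r^3)_{op}$ does not commute with
$H_0$, so $H_{so}$ does connect different n values, but that only matters in second order.)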


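Applying our rules for first order perturbation theory we have

$$\begin{aligned}
E_{n\ell sjm_j} &= E_n^{(0)} + \langle n\ell sjm_j|\,H_{so}\,|n\ell sjm_j\rangle \\
&= E_n^{(0)} + \frac{e^2\hbar^2}{4m^2c^2}\left(j(j+1) - \ell(\ell+1) - s(s+1)\right)\langle n\ell sjm_j|\left(\frac{1}{r^3}\right)_{op}|n\ell sjm_j\rangle
\end{aligned} \tag{10.276}$$

We now evaluate

$$\langle n\ell sjm_j|\left(\frac{1}{r^3}\right)_{op}|n\ell sjm_j\rangle = \int d^3\vec{r}\int d^3\vec{r}\,'\,\langle n\ell sjm_j\,|\,\vec{r}\,'\rangle\,\langle\vec{r}\,'|\left(\frac{1}{r^3}\right)_{op}|\vec{r}\rangle\,\langle\vec{r}\,|\,n\ell sjm_j\rangle \tag{10.277}$$

Now

$$\langle\vec{r}\,'|\left(\frac{1}{r^3}\right)_{op}|\vec{r}\rangle = \frac{1}{r^3}\langle\vec{r}\,'\,|\,\vec{r}\rangle = \frac{1}{r^3}\,\delta(\vec{r}\,' - \vec{r}) \tag{10.278}$$

which gives

$$\langle n\ell sjm_j|\left(\frac{1}{r^3}\right)_{op}|n\ell sjm_j\rangle = \int d^3\vec{r}\,\frac{1}{r^3}\left|\langle\vec{r}\,|\,n\ell sjm_j\rangle\right|^2 = \int d^3\vec{r}\,\frac{1}{r^3}\left|\psi_{n\ell sjm_j}(\vec{r})\right|^2$$

Therefore, we can calculate the energy corrections once we know $|n,\ell,s,j,m_j\rangle$.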

We first consider the trivial case of the n = 1 level in hydrogen. We have

$$E_1^{(0)} = -\frac{e^2}{2a_0} \tag{10.279}$$

and the corresponding states are shown in Table 10.1 below.

    n    ℓ    s      m_ℓ    m_s     j      m_j
    1    0    1/2    0      +1/2    1/2    +1/2
    1    0    1/2    0      -1/2    1/2    -1/2

Table 10.1: n = 1 level quantum numbers

or

$$|1,0,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} \quad \text{and} \quad |1,0,\tfrac{1}{2},\tfrac{1}{2},-\tfrac{1}{2}\rangle_{jm} \tag{10.280}$$

where we have added the label jm to distinguish them from the $|n,\ell,s,m_\ell,m_s\rangle$
states which we label with $m_\ell m_s$ = mm. We are able to specify $m_\ell$ and $m_s$ also
in this case because when $\ell = 0$ we must have $m_\ell = 0$ and j = s which says that
$m_j = m_s$.


This is a two-fold degenerate ground state for the atom in zeroth order.

Since $\ell = 0$, which implies that j = s = 1/2, the expectation value $\langle H_{so}\rangle = 0$.
Thus, there is no spin-orbit correction for this state to first order. In fact, there
is no spin-orbit correction to any order for an $\ell = 0$ state.


Now in general, we can write

$$|n,\ell,s,j,m_j\rangle = \sum_{\substack{m_\ell, m_s \\ m_\ell + m_s = m_j}} a_{n\ell sjm_jm_\ell m_s}\,|n,\ell,s,m_\ell,m_s\rangle \tag{10.281}$$

where the $a_{n\ell sjm_jm_\ell m_s}$ are the relevant Clebsch-Gordan (CG) coefficients.

For the n = 1 level we have the simple cases where

$$|1,0,\tfrac{1}{2},\tfrac{1}{2},\pm\tfrac{1}{2}\rangle_{jm} = |1,0,\tfrac{1}{2},0,\pm\tfrac{1}{2}\rangle_{mm} \tag{10.282}$$

i.e., the CG coefficients are equal to 1,

$$a_{1,0,\frac{1}{2},\frac{1}{2},\pm\frac{1}{2},0,\pm\frac{1}{2}} = 1 \tag{10.283}$$

which is always true for the (maximum j, maximum (minimum) $m_j$) states.
There is always only one such state.


The next level is the n = 2 level of hydrogen and the complexity of the calculation
increases fast. We have

$$E_2^{(0)} = -\frac{e^2}{8a_0} \tag{10.284}$$

It is always the case that the direct-product states $|n,\ell,s,m_\ell,m_s\rangle$ are easier to
write down. For this level the $|n,\ell,s,j,m_j\rangle$ states need to be constructed from
the $|n,\ell,s,m_\ell,m_s\rangle$ states. Before we proceed, we can enumerate the states in
both schemes. The degeneracy is given by

$$\text{degeneracy} = 2\sum_{\ell=0}^{n-1}(2\ell+1) = 2n^2 = 8 = \sum_{\ell=0}^{n-1}\ \sum_{j=|\ell-s|}^{\ell+s}(2j+1) = 2\sum_{\ell=0}^{n-1}(2\ell+1)$$

The states are shown in Tables 10.2 and 10.3 below.


    n    ℓ    s      m_ℓ    m_s     ket
    2    1    1/2     1     +1/2    |2,1,1/2,1,1/2>_mm
    2    1    1/2     0     +1/2    |2,1,1/2,0,1/2>_mm
    2    1    1/2    -1     +1/2    |2,1,1/2,-1,1/2>_mm
    2    1    1/2     1     -1/2    |2,1,1/2,1,-1/2>_mm
    2    1    1/2     0     -1/2    |2,1,1/2,0,-1/2>_mm
    2    1    1/2    -1     -1/2    |2,1,1/2,-1,-1/2>_mm
    2    0    1/2     0     +1/2    |2,0,1/2,0,1/2>_mm
    2    0    1/2     0     -1/2    |2,0,1/2,0,-1/2>_mm

Table 10.2: n = 2 level quantum numbers, m_ℓ m_s states


    n    ℓ    s      j      m_j     ket
    2    1    1/2    3/2    +3/2    |2,1,1/2,3/2,3/2>_jm
    2    1    1/2    3/2    +1/2    |2,1,1/2,3/2,1/2>_jm
    2    1    1/2    3/2    -1/2    |2,1,1/2,3/2,-1/2>_jm
    2    1    1/2    3/2    -3/2    |2,1,1/2,3/2,-3/2>_jm
    2    1    1/2    1/2    +1/2    |2,1,1/2,1/2,1/2>_jm
    2    1    1/2    1/2    -1/2    |2,1,1/2,1/2,-1/2>_jm
    2    0    1/2    1/2    +1/2    |2,0,1/2,1/2,1/2>_jm
    2    0    1/2    1/2    -1/2    |2,0,1/2,1/2,-1/2>_jm

Table 10.3: n = 2 level quantum numbers, j m_j states


In the first set of states, we could have also included the $m_j$ label since we must
have $m_j = m_\ell + m_s$.


In order to learn all the intricate details of this type of calculation, we shall 
proceed in two ways using the spin-orbit correction as an example. 

In method #1, we will construct the $|n,\ell,s,j,m_j\rangle$ states (the zero-order state
vectors) from the $|n,\ell,s,m_\ell,m_s\rangle$ states and then calculate the first-order
energy corrections. In this basis, the $\langle H_{so}\rangle$ matrix will be diagonal.

In method #2, we will construct the $\langle H_{so}\rangle$ matrix using the easiest states to
write down, namely the $|n,\ell,s,m_\ell,m_s\rangle$ states, and then diagonalize it to find
the correct first order energies and new zero-order state vectors, which should
be the $|n,\ell,s,j,m_j\rangle$ states.


Method #1 


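We start with the state with maximum j and $m_j$ values. This state always has
a CG coefficient equal to 1, i.e., there is only one way to construct it from the
other angular momenta.

$$|n=2,\ell=1,s=\tfrac{1}{2},j=\tfrac{3}{2},m_j=\tfrac{3}{2}\rangle_{jm} = |n=2,\ell=1,s=\tfrac{1}{2},m_\ell=1,m_s=\tfrac{1}{2}\rangle_{mm}$$

where we have shown all the labels explicitly. From now on we will write such
equations as

$$|2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{3}{2}\rangle_{jm} = |2,1,\tfrac{1}{2},1,\tfrac{1}{2}\rangle_{mm}$$

We then use the lowering operators to obtain

$$J_-\,|2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{3}{2}\rangle_{jm} = (L_- + S_-)\,|2,1,\tfrac{1}{2},1,\tfrac{1}{2}\rangle_{mm}$$

$$\begin{aligned}
\hbar\sqrt{\tfrac{3}{2}\left(\tfrac{3}{2}+1\right) - \tfrac{3}{2}\left(\tfrac{3}{2}-1\right)}\ |2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\rangle_{jm}
&= \hbar\sqrt{1(1+1) - 1(1-1)}\ |2,1,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} \\
&\quad + \hbar\sqrt{\tfrac{1}{2}\left(\tfrac{1}{2}+1\right) - \tfrac{1}{2}\left(\tfrac{1}{2}-1\right)}\ |2,1,\tfrac{1}{2},1,-\tfrac{1}{2}\rangle_{mm}
\end{aligned}$$

$$\hbar\sqrt{3}\ |2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\rangle_{jm} = \hbar\sqrt{2}\ |2,1,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} + \hbar\sqrt{1}\ |2,1,\tfrac{1}{2},1,-\tfrac{1}{2}\rangle_{mm}$$

or

$$|2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\rangle_{jm} = \sqrt{\tfrac{2}{3}}\ |2,1,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} + \sqrt{\tfrac{1}{3}}\ |2,1,\tfrac{1}{2},1,-\tfrac{1}{2}\rangle_{mm} \tag{10.285}$$

Notice that we use the total J operators on the left and the L and S operators
on the right.
The result is a linear combination of the $|n,\ell,s,m_\ell,m_s\rangle$ states all with $m_j = 1/2$
as we expected.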


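Continuing this process we have

$$J_-\,|2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\rangle_{jm} = (L_- + S_-)\left(\sqrt{\tfrac{2}{3}}\ |2,1,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} + \sqrt{\tfrac{1}{3}}\ |2,1,\tfrac{1}{2},1,-\tfrac{1}{2}\rangle_{mm}\right)$$

$$2\hbar\ |2,1,\tfrac{1}{2},\tfrac{3}{2},-\tfrac{1}{2}\rangle_{jm} = \hbar\sqrt{\tfrac{4}{3}}\ |2,1,\tfrac{1}{2},-1,\tfrac{1}{2}\rangle_{mm} + 2\hbar\sqrt{\tfrac{2}{3}}\ |2,1,\tfrac{1}{2},0,-\tfrac{1}{2}\rangle_{mm}$$

or

$$|2,1,\tfrac{1}{2},\tfrac{3}{2},-\tfrac{1}{2}\rangle_{jm} = \sqrt{\tfrac{1}{3}}\ |2,1,\tfrac{1}{2},-1,\tfrac{1}{2}\rangle_{mm} + \sqrt{\tfrac{2}{3}}\ |2,1,\tfrac{1}{2},0,-\tfrac{1}{2}\rangle_{mm} \tag{10.286}$$

and finally

$$|2,1,\tfrac{1}{2},\tfrac{3}{2},-\tfrac{3}{2}\rangle_{jm} = |2,1,\tfrac{1}{2},-1,-\tfrac{1}{2}\rangle_{mm} \tag{10.287}$$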


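We now need to construct the maximum state for the next lowest value of j,
namely,

$$|2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} \tag{10.288}$$

This state has $m_j = 1/2$ so it must be constructed out of the same states that
make up

$$|2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\rangle_{jm} \tag{10.289}$$

or it can be written as

$$|2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} = a\ |2,1,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} + b\ |2,1,\tfrac{1}{2},1,-\tfrac{1}{2}\rangle_{mm} \tag{10.290}$$

Now we must have

$${}_{jm}\langle 2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\,|\,2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} = 0 \quad \text{orthogonality} \tag{10.291}$$

$${}_{jm}\langle 2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\,|\,2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} = 1 \quad \text{normalization} \tag{10.292}$$

Using the orthonormality of the $|n,\ell,s,m_\ell,m_s\rangle$ states we get

$$\sqrt{\tfrac{2}{3}}\,a + \sqrt{\tfrac{1}{3}}\,b = 0 \quad \text{and} \quad a^2 + b^2 = 1 \tag{10.293}$$

The solution (determined up to an overall sign) is

$$a = -\sqrt{\tfrac{1}{3}} \quad \text{and} \quad b = \sqrt{\tfrac{2}{3}} \tag{10.294}$$

and therefore

$$|2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} = -\sqrt{\tfrac{1}{3}}\ |2,1,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} + \sqrt{\tfrac{2}{3}}\ |2,1,\tfrac{1}{2},1,-\tfrac{1}{2}\rangle_{mm} \tag{10.295}$$

In a similar manner, we find

$$|2,1,\tfrac{1}{2},\tfrac{1}{2},-\tfrac{1}{2}\rangle_{jm} = -\sqrt{\tfrac{2}{3}}\ |2,1,\tfrac{1}{2},-1,\tfrac{1}{2}\rangle_{mm} + \sqrt{\tfrac{1}{3}}\ |2,1,\tfrac{1}{2},0,-\tfrac{1}{2}\rangle_{mm} \tag{10.296}$$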


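Finally, we construct the other j = 1/2 states with $\ell = 0$. They are

$$|2,0,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} = |2,0,\tfrac{1}{2},0,\tfrac{1}{2}\rangle_{mm} \tag{10.297}$$

$$|2,0,\tfrac{1}{2},\tfrac{1}{2},-\tfrac{1}{2}\rangle_{jm} = |2,0,\tfrac{1}{2},0,-\tfrac{1}{2}\rangle_{mm} \tag{10.298}$$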
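
The lowering-operator construction of Method #1 is mechanical enough to automate. The following sketch, in units where $\hbar = 1$, starts from the top state $|j = \ell + s, m_j = j\rangle = |m_\ell = \ell, m_s = s\rangle$ and applies $J_- = L_- + S_-$ repeatedly. The function names (`lower_coeff`, `lower_product_state`, `ladder`) and the dictionary representation of a state are assumptions made for this illustration only.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)

def lower_coeff(j, m):
    """c in J_-|j,m> = c |j,m-1> (hbar = 1)."""
    return math.sqrt(j * (j + 1) - m * (m - 1))

def lower_product_state(state, l, s):
    """Apply L_- + S_- to a state stored as {(m_l, m_s): amplitude}."""
    out = {}
    for (ml, ms), amp in state.items():
        if ml > -l:                                   # L_- acts on the orbital label
            key = (ml - 1, ms)
            out[key] = out.get(key, 0.0) + amp * lower_coeff(l, ml)
        if ms > -s:                                   # S_- acts on the spin label
            key = (ml, ms - 1)
            out[key] = out.get(key, 0.0) + amp * lower_coeff(s, ms)
    return out

def ladder(l, s):
    """Expansions of all |j = l+s, m_j> states in the product basis."""
    j = l + s
    m = j
    state = {(l, s): 1.0}                             # top state, CG coefficient 1
    result = {m: state}
    while m > -j:
        lowered = lower_product_state(state, l, s)
        norm = lower_coeff(j, m)                      # same J_- acting on the left side
        state = {k: v / norm for k, v in lowered.items()}
        m -= 1
        result[m] = state
    return result

states = ladder(1, HALF)
for m_j, expansion in states.items():
    print(m_j, expansion)
```

For $\ell = 1$, $s = 1/2$ the $m_j = 1/2$ entry has squared amplitudes 2/3 and 1/3 on $(m_\ell, m_s) = (0, +1/2)$ and $(1, -1/2)$, as in (10.285). The $j = 1/2$ states are not produced by this ladder; they follow from orthogonality as in the text.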


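We can now calculate the first-order energy corrections. We do not actually need
the detailed construction of the states to do this, but we will need these states
to compare with the results of Method #2 later. We found earlier (10.276) that
in the $|n,\ell,s,j,m_j\rangle$ basis

$$E_{n\ell sjm_j} = E_n^{(0)} + \frac{e^2\hbar^2}{4m^2c^2}\left(j(j+1) - \ell(\ell+1) - s(s+1)\right)A_{n\ell sjm_j} \tag{10.299}$$

$$A_{n\ell sjm_j} = \int d^3\vec{r}\ \frac{1}{r^3}\left|\psi_{n\ell sjm_j}(\vec{r})\right|^2 \tag{10.300}$$

or

$$\Delta E_{n,\ell,s,j,m_j} = E_{n,\ell,s,j,m_j} - E_n^{(0)} \tag{10.301}$$

$$\Delta E_{n,\ell,s,j,m_j} = \frac{e^2\hbar^2}{4m^2c^2}\left\{\begin{array}{cl} \ell & j = \ell + 1/2 \\ -(\ell+1) & j = \ell - 1/2 \\ 0 & \ell = 0 \end{array}\right\}\int d^3\vec{r}\ \frac{1}{r^3}\left|\psi_{n,\ell,s,j=\ell\pm\frac{1}{2},m_j}(\vec{r})\right|^2 \tag{10.302}$$

Evaluating the integrals we get

$$\Delta E_{n,\ell,s,j,m_j} = \frac{Z^2|E_n^{(0)}|\alpha^2}{n(2\ell+1)(\ell+1)} \qquad j = \ell + \tfrac{1}{2} \tag{10.303}$$

$$\Delta E_{n,\ell,s,j,m_j} = -\frac{Z^2|E_n^{(0)}|\alpha^2}{n\ell(2\ell+1)} \qquad j = \ell - \tfrac{1}{2} \tag{10.304}$$

$$\Delta E_{n,\ell,s,j,m_j} = 0 \qquad \ell = 0 \tag{10.305}$$

where

$$\alpha = \frac{e^2}{\hbar c} = \text{fine structure constant} \tag{10.306}$$

Therefore, for the n = 2 level we have

$$\Delta E_{2,1,\frac{1}{2},\frac{3}{2},m_j} = \frac{Z^2|E_2^{(0)}|\alpha^2}{12} \qquad j = \ell + \tfrac{1}{2} = \tfrac{3}{2} \tag{10.307}$$

$$\Delta E_{2,1,\frac{1}{2},\frac{1}{2},m_j} = -\frac{Z^2|E_2^{(0)}|\alpha^2}{6} \qquad j = \ell - \tfrac{1}{2} = \tfrac{1}{2} \tag{10.308}$$

$$\Delta E_{2,0,\frac{1}{2},\frac{1}{2},m_j} = 0 \qquad j = \ell + \tfrac{1}{2} = \tfrac{1}{2} \tag{10.309}$$

We note that for hydrogen $Z^2\alpha^2 \approx 10^{-4}$ and thus, the fine structure splitting is
significantly smaller than the zero-order energies.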
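
To get a feeling for the numbers, the sketch below evaluates (10.303)-(10.305) in electron volts. The numerical inputs ($\alpha \approx 1/137.036$ and $|E_1^{(0)}| \approx 13.6057$ eV) are standard values that I am supplying; they are not quoted in this section, and the function name `spin_orbit_shift` is hypothetical.

```python
ALPHA = 1 / 137.036        # fine structure constant (assumed standard value)
RYDBERG_EV = 13.6057       # |E_1^(0)| for hydrogen in eV (assumed standard value)

def spin_orbit_shift(n, l, j, Z=1):
    """First-order spin-orbit shift in eV from Eqs. (10.303)-(10.305)."""
    E0 = Z**2 * RYDBERG_EV / n**2              # |E_n^(0)|
    if l == 0:
        return 0.0
    if j == l + 0.5:
        return Z**2 * E0 * ALPHA**2 / (n * (2 * l + 1) * (l + 1))
    return -Z**2 * E0 * ALPHA**2 / (n * l * (2 * l + 1))

up = spin_orbit_shift(2, 1, 1.5)               # j = 3/2
down = spin_orbit_shift(2, 1, 0.5)             # j = 1/2
print(f"j=3/2: {up:.3e} eV   j=1/2: {down:.3e} eV   splitting: {up - down:.3e} eV")
```

The j = 3/2 level moves up by one unit and the j = 1/2 level moves down by two units, so the degeneracy-weighted sum 4(up) + 2(down) vanishes: the spin-orbit term splits the $\ell = 1$ level without shifting its center of gravity.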


The relativity correction is the same order of magnitude as the spin-orbit
correction. We found

$$H_{relativity} = -\frac{\vec{p}_{op}^{\,4}}{8m^3c^2} \tag{10.310}$$

This gives the correction

$$\Delta E_{rel} = -\frac{\hbar^4}{8m^3c^2}\int d^3\vec{r}\ \psi^*_{n\ell m}(\vec{r})\,\nabla^4\,\psi_{n\ell m}(\vec{r}) = -\frac{Z^2|E_n^{(0)}|\alpha^2}{4n^2}\left(\frac{4n}{\ell+\frac{1}{2}} - 3\right) \tag{10.311}$$

Combining these two correction terms (spin-orbit and relativity) gives

$$\Delta E_{fine\ structure} = -\frac{Z^2|E_n^{(0)}|\alpha^2}{4n^2}\left(\frac{4n}{j+\frac{1}{2}} - 3\right) \tag{10.312}$$

The result is independent of $\ell$. It turns out that this result is valid for $\ell = 0$
also. There is an additional term that must be added to the Hamiltonian which
contributes only in $\ell = 0$ states. It is called the Darwin term.

The Darwin term comes from the relativistic equation for the electron and takes
the form

$$H_{Darwin} = \frac{\hbar^2}{8m^2c^2}\nabla^2V = \frac{\hbar^2}{8m^2c^2}\left(4\pi e\,Q_{nuclear}(\vec{r})\right) = \frac{\pi\hbar^2Ze^2}{2m^2c^2}\,\delta(\vec{r}) \tag{10.313}$$

where

$$Q_{nuclear}(\vec{r}) = \text{the nuclear charge density} = Ze\,\delta(\vec{r}) \tag{10.314}$$

Because of the delta function, a contribution

$$\langle H_{Darwin}\rangle_{n\ell} = \frac{\pi\hbar^2Ze^2}{2m^2c^2}\left|\psi_{n00}(0)\right|^2\delta_{\ell,0} = \frac{mc^2Z^4\alpha^4}{2n^3}\,\delta_{\ell,0} \tag{10.315}$$

arises for $\ell = 0$ states only. This is identical to the contribution $\langle H_{so} + H_{rel}\rangle$ that
(10.312) gives for $\ell = 0$, j = 1/2.


Method #2 

We use the $|n,\ell,s,m_\ell,m_s\rangle$ basis. In this case, the best form of the operators to
use (meaning we know how to evaluate them with these states) is

$$H_{so} = \frac{e^2}{2m^2c^2}\left(\frac{1}{r^3}\right)_{op}\vec{S}_{op}\cdot\vec{L}_{op} = \frac{e^2}{2m^2c^2}\left(\frac{1}{r^3}\right)_{op}\left(L_zS_z + \frac{1}{2}\left(L_+S_- + L_-S_+\right)\right) \tag{10.316}$$

If we label the rows and columns of the matrix representation by

    |1> = |2,1,1/2, 1, 1/2>_mm      m_j =  3/2
    |2> = |2,1,1/2, 1,-1/2>_mm      m_j =  1/2
    |3> = |2,1,1/2, 0, 1/2>_mm      m_j =  1/2
    |4> = |2,1,1/2, 0,-1/2>_mm      m_j = -1/2
    |5> = |2,1,1/2,-1, 1/2>_mm      m_j = -1/2
    |6> = |2,1,1/2,-1,-1/2>_mm      m_j = -3/2
    |7> = |2,0,1/2, 0, 1/2>_mm      m_j =  1/2
    |8> = |2,0,1/2, 0,-1/2>_mm      m_j = -1/2


then we get the matrix for $\langle H_{so}\rangle$ as shown in Table 10.4 below.

I have used a table rather than an equation format so that I could clearly label
the rows and columns by the state index.

         1    2    3    4    5    6    7    8
    1    a    0    0    0    0    0    0    0
    2    0    b    c    0    0    0    0    0
    3    0    c    d    0    0    0    0    0
    4    0    0    0    e    f    0    0    0
    5    0    0    0    f    g    0    0    0
    6    0    0    0    0    0    h    0    0
    7    0    0    0    0    0    0    p    0
    8    0    0    0    0    0    0    0    q

Table 10.4: $\langle H_{so}\rangle$ matrix


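We have marked the non-zero elements. Using the operator properties derived
earlier we get

$$\begin{aligned}
a = \langle 1|\,H_{so}\,|1\rangle &= {}_{mm}\langle 2,1,\tfrac{1}{2},1,\tfrac{1}{2}|\,H_{so}\,|2,1,\tfrac{1}{2},1,\tfrac{1}{2}\rangle_{mm} \\
&= \frac{e^2}{2m^2c^2}\ {}_{mm}\langle 2,1,\tfrac{1}{2},1,\tfrac{1}{2}|\ \frac{1}{r^3}\left(L_zS_z + \frac{1}{2}(L_+S_- + L_-S_+)\right)|2,1,\tfrac{1}{2},1,\tfrac{1}{2}\rangle_{mm} \\
&= \frac{e^2}{2m^2c^2}\ {}_{mm}\langle 2,1,\tfrac{1}{2},1,\tfrac{1}{2}|\ \frac{1}{r^3}L_zS_z\ |2,1,\tfrac{1}{2},1,\tfrac{1}{2}\rangle_{mm} \\
&= \frac{e^2\hbar^2}{4m^2c^2}\ \langle 2,1|\ \frac{1}{r^3}\ |2,1\rangle = \frac{e^2\hbar^2}{4m^2c^2}\int_0^\infty \frac{1}{r}R_{n\ell}^2(r)\,dr \\
&= \frac{e^2\hbar^2}{4m^2c^2}\left[\frac{Z^3}{a_0^3n^3\ell(\ell+\frac{1}{2})(\ell+1)}\right]
\end{aligned} \tag{10.317}$$

Similar calculations give $\langle H_{so}\rangle$ as shown in Table 10.5 below.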



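         1     2      3      4      5     6    7    8
    1    a     0      0      0      0     0    0    0
    2    0    -a     √2 a    0      0     0    0    0
    3    0    √2 a    0      0      0     0    0    0
    4    0     0      0      0     √2 a   0    0    0
    5    0     0      0     √2 a   -a     0    0    0
    6    0     0      0      0      0     a    0    0
    7    0     0      0      0      0     0    0    0
    8    0     0      0      0      0     0    0    0

Table 10.5: $\langle H_{so}\rangle$ matrix - revised

This says that (only diagonal elements)

$$E_1^{(1)} = a = E_6^{(1)} \quad \text{and} \quad E_7^{(1)} = 0 = E_8^{(1)} \tag{10.318}$$

and

$$|1\rangle = |2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{3}{2}\rangle_{jm} \quad \text{and} \quad |6\rangle = |2,1,\tfrac{1}{2},\tfrac{3}{2},-\tfrac{3}{2}\rangle_{jm}$$

$$|7\rangle = |2,0,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} \quad \text{and} \quad |8\rangle = |2,0,\tfrac{1}{2},\tfrac{1}{2},-\tfrac{1}{2}\rangle_{jm}$$

In order to find $E_{2'}^{(1)}$, $E_{3'}^{(1)}$, $E_{4'}^{(1)}$, $E_{5'}^{(1)}$ corresponding to the new zero-order state
vectors $|2'\rangle$, $|3'\rangle$, $|4'\rangle$, $|5'\rangle$, we must diagonalize the two 2x2 submatrices.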


We begin with the submatrix involving states $|2\rangle$ and $|3\rangle$ as shown in Table 10.6,
namely,

         2      3
    2   -a     √2 a
    3   √2 a    0

Table 10.6: $\langle H_{so}\rangle$ 2-3 submatrix

The characteristic equation is

$$(-a - E)(-E) - 2a^2 = 0 = E^2 + aE - 2a^2 \tag{10.319}$$

or

$$E_{2'}^{(1)} = a \quad \text{and} \quad E_{3'}^{(1)} = -2a \tag{10.320}$$

Notice that these energies are

$$E_{2'}^{(1)} = \ell a \quad \text{and} \quad E_{3'}^{(1)} = -(\ell+1)a \tag{10.321}$$

as expected for the

$$j = \ell + \tfrac{1}{2} \quad \text{and} \quad j = \ell - \tfrac{1}{2} \tag{10.322}$$

states respectively.



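We find the eigenvectors using the eigenvalue equations. For $|2'\rangle$ we have

$$\begin{pmatrix} -a & a\sqrt{2} \\ a\sqrt{2} & 0 \end{pmatrix}|2'\rangle = \begin{pmatrix} -a & a\sqrt{2} \\ a\sqrt{2} & 0 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = E_{2'}^{(1)}\begin{pmatrix} u \\ v \end{pmatrix} = a\begin{pmatrix} u \\ v \end{pmatrix}$$

or

$$-u + \sqrt{2}\,v = u \quad \text{and} \quad \sqrt{2}\,u = v \tag{10.323}$$

Using the normalization condition $u^2 + v^2 = 1$ we get

$$u = \sqrt{\tfrac{1}{3}} \quad \text{and} \quad v = \sqrt{\tfrac{2}{3}} \tag{10.324}$$

or

$$|2'\rangle = \sqrt{\tfrac{1}{3}}\ |2\rangle + \sqrt{\tfrac{2}{3}}\ |3\rangle = |2,1,\tfrac{1}{2},\tfrac{3}{2},\tfrac{1}{2}\rangle_{jm} \tag{10.325}$$

and similarly

$$|3'\rangle = \sqrt{\tfrac{2}{3}}\ |2\rangle - \sqrt{\tfrac{1}{3}}\ |3\rangle = |2,1,\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2}\rangle_{jm} \tag{10.326}$$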

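We then deal with the submatrix involving states $|4\rangle$ and $|5\rangle$ as shown in Table
10.7, namely,

         4      5
    4    0     √2 a
    5   √2 a   -a

Table 10.7: $\langle H_{so}\rangle$ 4-5 submatrix

The characteristic equation is

$$(-a - E)(-E) - 2a^2 = 0 = E^2 + aE - 2a^2 \tag{10.327}$$

or

$$E_{4'}^{(1)} = a \quad \text{and} \quad E_{5'}^{(1)} = -2a \tag{10.328}$$

and the eigenvectors are

$$|4'\rangle = \sqrt{\tfrac{2}{3}}\ |4\rangle + \sqrt{\tfrac{1}{3}}\ |5\rangle = |2,1,\tfrac{1}{2},\tfrac{3}{2},-\tfrac{1}{2}\rangle_{jm} \tag{10.329}$$

$$|5'\rangle = \sqrt{\tfrac{1}{3}}\ |4\rangle - \sqrt{\tfrac{2}{3}}\ |5\rangle = |2,1,\tfrac{1}{2},\tfrac{1}{2},-\tfrac{1}{2}\rangle_{jm} \tag{10.330}$$

So including the spin-orbit correction we end up with the energy levels

    E_2^(0) + a     for 1, 2', 4', 6   →  4-fold degenerate
    E_2^(0)         for 7, 8           →  2-fold degenerate
    E_2^(0) - 2a    for 3', 5'         →  2-fold degenerate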
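
The two 2x2 diagonalizations of Method #2 can be reproduced in a few lines. This is a sketch in units where a = 1; the helper name `eigenpair_2x2` is hypothetical, and the eigenvector sign is fixed by the arbitrary choice of making the first component positive.

```python
import math

def eigenpair_2x2(p, q, r, which):
    """Eigenvalue and normalized eigenvector of [[p, q], [q, r]], q != 0.

    which = +1 selects the larger eigenvalue, which = -1 the smaller one.
    """
    mean = 0.5 * (p + r)
    half_gap = math.hypot(0.5 * (p - r), q)
    E = mean + which * half_gap
    u, v = q, E - p                 # solves the first row: (p - E) u + q v = 0
    norm = math.hypot(u, v)
    return E, (u / norm, v / norm)

a = 1.0
root2 = math.sqrt(2.0)

# 2-3 block of Table 10.5: rows (-a, sqrt(2) a) and (sqrt(2) a, 0)
E2, vec2 = eigenpair_2x2(-a, root2 * a, 0.0, +1)
E3, vec3 = eigenpair_2x2(-a, root2 * a, 0.0, -1)

# 4-5 block of Table 10.5: rows (0, sqrt(2) a) and (sqrt(2) a, -a)
E4, vec4 = eigenpair_2x2(0.0, root2 * a, -a, +1)
E5, vec5 = eigenpair_2x2(0.0, root2 * a, -a, -1)

for label, E, vec in (("2'", E2, vec2), ("3'", E3, vec3),
                      ("4'", E4, vec4), ("5'", E5, vec5)):
    print(label, round(E, 12), vec)
```

The eigenvalues come out as a and -2a in each block, and the eigenvector components are $\pm\sqrt{1/3}$ and $\pm\sqrt{2/3}$, i.e., the Clebsch-Gordan coefficients found by Method #1.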
10.6.2. Spin-Orbit and Arbitrary Magnetic Field

Now let us add on the Zeeman correction (for $\vec{B} = B\hat{z}$)

$$H_{Zeeman} = \frac{\mu_B}{\hbar}\left(\vec{L}_{op} + 2\vec{S}_{op}\right)\cdot\vec{B} = \frac{\mu_BB}{\hbar}\left(L_z + 2S_z\right) \tag{10.331}$$

We can solve this problem for arbitrary magnetic field by repeating Method #2
using the correction term as the sum of spin-orbit and Zeeman effects.

The zero-order Hamiltonian is $H_0$ and the zero-order state vectors are the
$|n,\ell,s,m_\ell,m_s\rangle$ states. The eight zero-order n = 2 states are all degenerate with
energy

$$E_2^{(0)} = -\frac{e^2}{8a_0} \tag{10.332}$$

so we must use degenerate perturbation theory.

We have already calculated the $\langle H_{so}\rangle$ matrix in this basis. It is shown in Table 10.8
below.



         1     2      3      4      5     6    7    8
    1    a     0      0      0      0     0    0    0
    2    0    -a     √2 a    0      0     0    0    0
    3    0    √2 a    0      0      0     0    0    0
    4    0     0      0      0     √2 a   0    0    0
    5    0     0      0     √2 a   -a     0    0    0
    6    0     0      0      0      0     a    0    0
    7    0     0      0      0      0     0    0    0
    8    0     0      0      0      0     0    0    0

Table 10.8: $\langle H_{so}\rangle$ matrix

where

$$a = \frac{e^2\hbar^2}{96m^2a_0^3c^2} \tag{10.333}$$

The $(L_z + 2S_z)$ matrix is diagonal in this representation and its diagonal
elements are given by $\hbar(m_\ell + 2m_s)$ and so the Zeeman contribution $\langle H_{Zeeman}\rangle$ is
shown in Table 10.9 below.

         1     2    3    4    5    6     7    8
    1    2b    0    0    0    0    0     0    0
    2    0     0    0    0    0    0     0    0
    3    0     0    b    0    0    0     0    0
    4    0     0    0   -b    0    0     0    0
    5    0     0    0    0    0    0     0    0
    6    0     0    0    0    0   -2b    0    0
    7    0     0    0    0    0    0     b    0
    8    0     0    0    0    0    0     0   -b

Table 10.9: $\langle H_{Zeeman}\rangle$ matrix


where $b = \mu_BB$. The combined perturbation matrix $\langle V\rangle$ is then given in Table
10.10 below.

         1       2      3      4      5      6       7    8
    1    a+2b    0      0      0      0      0       0    0
    2    0      -a     a√2     0      0      0       0    0
    3    0      a√2     b      0      0      0       0    0
    4    0       0      0     -b     a√2     0       0    0
    5    0       0      0     a√2    -a      0       0    0
    6    0       0      0      0      0     a-2b     0    0
    7    0       0      0      0      0      0       b    0
    8    0       0      0      0      0      0       0   -b

Table 10.10: $\langle V\rangle$ matrix


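After diagonalizing, the new energies are

$$E_{1'} = -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} + 2\mu_BB$$

$$E_{2'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a - b) + \sqrt{9a^2 + 2ab + b^2}\right)$$

$$E_{3'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a - b) - \sqrt{9a^2 + 2ab + b^2}\right)$$

$$E_{4'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a + b) + \sqrt{9a^2 - 2ab + b^2}\right)$$

$$E_{5'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a + b) - \sqrt{9a^2 - 2ab + b^2}\right)$$

$$E_{6'} = -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} - 2\mu_BB$$

$$E_{7'} = -\frac{e^2}{8a_0} + \mu_BB$$

$$E_{8'} = -\frac{e^2}{8a_0} - \mu_BB$$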

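If we let B be small so that $b \ll a$, we then get the approximate energies

$$E_{1'} = -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} + 2\mu_BB$$

$$E_{2'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a - b) + 3a\left(1 + \frac{b}{9a}\right)\right) = -\frac{e^2}{8a_0} + a + \frac{2}{3}b = -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} + \frac{2}{3}\mu_BB$$

$$E_{3'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a - b) - 3a\left(1 + \frac{b}{9a}\right)\right) = -\frac{e^2}{8a_0} - 2a + \frac{1}{3}b = -\frac{e^2}{8a_0} - \frac{e^2}{8a_0}\frac{\alpha^2}{6} + \frac{1}{3}\mu_BB$$

$$E_{4'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a + b) + 3a\left(1 - \frac{b}{9a}\right)\right) = -\frac{e^2}{8a_0} + a - \frac{2}{3}b = -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} - \frac{2}{3}\mu_BB$$

$$E_{5'} = -\frac{e^2}{8a_0} + \frac{1}{2}\left(-(a + b) - 3a\left(1 - \frac{b}{9a}\right)\right) = -\frac{e^2}{8a_0} - 2a - \frac{1}{3}b = -\frac{e^2}{8a_0} - \frac{e^2}{8a_0}\frac{\alpha^2}{6} - \frac{1}{3}\mu_BB$$

$$E_{6'} = -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} - 2\mu_BB$$

$$E_{7'} = -\frac{e^2}{8a_0} + \mu_BB$$

$$E_{8'} = -\frac{e^2}{8a_0} - \mu_BB$$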


This is clearly a perturbation of the spin-orbit energy levels. We assume that
the new state vectors become the zero-order vectors in the spin-orbit case for
low fields. In this case the Zeeman effect corrections (to the fine structure
energies) are given by Table 10.11 below.

    ℓ    s      j      m_j     ΔE_Zeeman     State
    1    1/2    3/2    +3/2     2 μ_B B       1'
    1    1/2    3/2    +1/2     2 μ_B B/3     2'
    1    1/2    1/2    +1/2     μ_B B/3       3'
    1    1/2    3/2    -1/2    -2 μ_B B/3     4'
    1    1/2    1/2    -1/2    -μ_B B/3       5'
    1    1/2    3/2    -3/2    -2 μ_B B       6'
    0    1/2    1/2    +1/2     μ_B B         7'
    0    1/2    1/2    -1/2    -μ_B B         8'

Table 10.11: n = 2 energy corrections for small B
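
The quality of the weak-field expansion can be checked by comparing the exact eigenvalues of the combined matrix in Table 10.10 with the linearized values. The sketch below works with the energy shifts relative to $E_2^{(0)}$, in units where a = 1; the function names are illustrative assumptions.

```python
import math

def eig_sym_2x2(p, q, r):
    """Eigenvalues (larger, smaller) of the symmetric matrix [[p, q], [q, r]]."""
    mean = 0.5 * (p + r)
    half_gap = math.hypot(0.5 * (p - r), q)
    return mean + half_gap, mean - half_gap

def exact_shifts(a, b):
    """Exact eigenvalues of the <V> matrix of Table 10.10."""
    e2, e3 = eig_sym_2x2(-a, math.sqrt(2) * a, b)     # 2-3 block
    e4, e5 = eig_sym_2x2(-b, math.sqrt(2) * a, -a)    # 4-5 block
    return {"1'": a + 2 * b, "2'": e2, "3'": e3, "4'": e4,
            "5'": e5, "6'": a - 2 * b, "7'": b, "8'": -b}

def weak_field_shifts(a, b):
    """Shifts to first order in b/a, as in Table 10.11."""
    return {"1'": a + 2 * b, "2'": a + 2 * b / 3, "3'": -2 * a + b / 3,
            "4'": a - 2 * b / 3, "5'": -2 * a - b / 3, "6'": a - 2 * b,
            "7'": b, "8'": -b}

a, b = 1.0, 1e-3
exact = exact_shifts(a, b)
approx = weak_field_shifts(a, b)
for state in exact:
    print(state, exact[state], approx[state], exact[state] - approx[state])
```

For b/a = 0.001 the two sets differ only at order $b^2/a$, which is the size of the first term dropped in the expansion of the square roots.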


A little bit of study shows the general relation

$$\Delta E_{Zeeman} = g\,\mu_BB\,m_j \tag{10.334}$$

where

$$g = \text{Lande g-factor} = 1 + \frac{j(j+1) - \ell(\ell+1) + s(s+1)}{2j(j+1)} \tag{10.335}$$

This is called the Zeeman effect.

We can prove this result in general. The general method uses the Wigner-Eckart
Theorem.
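
Before the general proof, the relation can be checked against the n = 2 table entry by entry. The sketch below evaluates (10.335) with exact fractions; the function name `lande_g` is a hypothetical helper introduced for this check.

```python
from fractions import Fraction

def lande_g(j, l, s=Fraction(1, 2)):
    """Lande g-factor of Eq. (10.335), evaluated exactly."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))

# (l, j, m_j) for the eight n = 2 states; the shift is g * m_j in units of mu_B * B
n2_states = [(1, Fraction(3, 2), Fraction(3, 2)), (1, Fraction(3, 2), Fraction(1, 2)),
             (1, Fraction(1, 2), Fraction(1, 2)), (1, Fraction(3, 2), Fraction(-1, 2)),
             (1, Fraction(1, 2), Fraction(-1, 2)), (1, Fraction(3, 2), Fraction(-3, 2)),
             (0, Fraction(1, 2), Fraction(1, 2)), (0, Fraction(1, 2), Fraction(-1, 2))]

shifts = [lande_g(j, l) * m for (l, j, m) in n2_states]
print(shifts)
```

The g-factors are 4/3 for the $\ell = 1$, j = 3/2 states, 2/3 for $\ell = 1$, j = 1/2, and 2 for $\ell = 0$, which reproduces the coefficients of $\mu_BB$ in Table 10.11.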


10.6.3. Wigner-Eckart Theorem 

Consider a vector operator $\vec{A}_{op}$. We have already shown that the Cartesian
components of any vector operator have the following commutation relations
with the Cartesian components of the angular momentum operator

$$[A_i, J_j] = i\hbar\,\varepsilon_{ijk}A_k \tag{10.336}$$

We will now prove the following powerful theorem:

In a basis that diagonalizes $\vec{J}_{op}^{\,2}$ and $J_z$ (i.e., the $|\lambda,\ell,s,j,m_j\rangle$ states, where $\lambda$
stands for the quantum numbers of any other operators that commute with $\vec{J}_{op}^{\,2}$
and $J_z$), the matrix elements of $\vec{A}_{op}$ between states with the same j-value are
proportional to the matrix elements of $\vec{J}_{op}$ and the proportionality factor is
independent of $m_j$.

The algebra involved in the proof is simpler if we work in the so-called spherical
basis instead of the Cartesian basis. The spherical basis uses

$$J_\pm = J_x \pm iJ_y \ , \quad J_0 = J_z \tag{10.337}$$

$$A_\pm = A_x \pm iA_y \ , \quad A_0 = A_z \tag{10.338}$$

The corresponding commutators are

$$[A_\pm, J_0] = \mp\hbar A_\pm \ , \quad [A_\pm, J_\pm] = 0 \tag{10.339}$$

$$[A_\pm, J_\mp] = \pm2\hbar A_0 \ , \quad [A_0, J_0] = 0 \tag{10.340}$$

$$[A_0, J_\pm] = \pm\hbar A_\pm \tag{10.341}$$

which all follow from the original commutator for the arbitrary vector operator.
Now, by definition of the operators, we have

$$J_0\,|\lambda,j,m_j\rangle = \hbar m_j\,|\lambda,j,m_j\rangle = \langle j,m_j|\,J_0\,|j,m_j\rangle\,|\lambda,j,m_j\rangle \tag{10.342}$$

$$J_\pm\,|\lambda,j,m_j\rangle = \hbar\sqrt{j(j+1) - m_j(m_j\pm1)}\ |\lambda,j,m_j\pm1\rangle = \langle j,m_j\pm1|\,J_\pm\,|j,m_j\rangle\,|\lambda,j,m_j\pm1\rangle \tag{10.343}$$

$$\begin{aligned}
\langle\lambda,j,m_j|\,J_\pm &= \left(J_\mp\,|\lambda,j,m_j\rangle\right)^+ = \langle\lambda,j,m_j\mp1|\ \hbar\sqrt{j(j+1) - m_j(m_j\mp1)} \\
&= \langle\lambda,j,m_j\mp1|\ \langle j,m_j|\,J_\pm\,|j,m_j\mp1\rangle
\end{aligned} \tag{10.344}$$

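We now work with the matrix elements of some of the commutators and use the
defining relations above to prove the theorem.

First, we have

$$\begin{aligned}
\mp\hbar\,\langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\rangle &= \langle\lambda',j,m'_j|\,[A_\pm, J_0]\,|\lambda,j,m_j\rangle \\
&= \langle\lambda',j,m'_j|\,[A_\pm J_0 - J_0A_\pm]\,|\lambda,j,m_j\rangle \\
&= (m_j - m'_j)\hbar\,\langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\rangle
\end{aligned} \tag{10.345}$$

or

$$0 = (m_j - m'_j \pm 1)\hbar\,\langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\rangle \tag{10.346}$$

This says that either $m'_j = m_j \pm 1$ or the matrix element

$$\langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\rangle = 0 \tag{10.347}$$

Since we have an identical property for the matrix elements of $J_\pm$ this implies
that the matrix elements of $A_\pm$ are proportional to those of $J_\pm$ and we can write
the proportionality constant as

$$\frac{\langle\lambda',j,m_j\pm1|\,A_\pm\,|\lambda,j,m_j\rangle}{\langle j,m_j\pm1|\,J_\pm\,|j,m_j\rangle} \tag{10.348}$$

Second, we have

$$\langle\lambda',j,m'_j|\,[A_\pm, J_\pm]\,|\lambda,j,m_j\rangle = 0 \tag{10.349}$$

$$\langle\lambda',j,m'_j|\,A_\pm J_\pm\,|\lambda,j,m_j\rangle = \langle\lambda',j,m'_j|\,J_\pm A_\pm\,|\lambda,j,m_j\rangle \tag{10.350}$$

$$\begin{aligned}
&\langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\pm1\rangle\,\langle j,m_j\pm1|\,J_\pm\,|j,m_j\rangle \\
&\qquad = \langle\lambda',j,m'_j\mp1|\,A_\pm\,|\lambda,j,m_j\rangle\,\langle j,m'_j|\,J_\pm\,|j,m'_j\mp1\rangle
\end{aligned} \tag{10.351}$$

Using the result from the first commutator this says that $m'_j = m_j \pm 2$, which,
in turn, implies that

$$\frac{\langle\lambda',j,m_j\pm2|\,A_\pm\,|\lambda,j,m_j\pm1\rangle}{\langle j,m_j\pm2|\,J_\pm\,|j,m_j\pm1\rangle} = \frac{\langle\lambda',j,m_j\pm1|\,A_\pm\,|\lambda,j,m_j\rangle}{\langle j,m_j\pm1|\,J_\pm\,|j,m_j\rangle} \tag{10.352}$$

This says that the proportionality constant is independent of $m_j$.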

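We define a new symbol for the proportionality constant

$$\langle\lambda',j\|A\|\lambda,j\rangle_\pm = \text{the reduced matrix element} \tag{10.353}$$

which gives the relation

$$\langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\rangle = \langle\lambda',j\|A\|\lambda,j\rangle_\pm\,\langle j,m'_j|\,J_\pm\,|j,m_j\rangle \tag{10.354}$$

To complete the proof we need to show that the same result holds for $A_0$ and
that

$$\langle\lambda',j\|A\|\lambda,j\rangle_+ = \langle\lambda',j\|A\|\lambda,j\rangle_- \tag{10.355}$$

We have

$$\begin{aligned}
\pm2\hbar\,\langle\lambda',j,m'_j|\,A_0\,|\lambda,j,m_j\rangle &= \langle\lambda',j,m'_j|\,[A_\pm, J_\mp]\,|\lambda,j,m_j\rangle \\
&= \langle\lambda',j,m'_j|\,[A_\pm J_\mp - J_\mp A_\pm]\,|\lambda,j,m_j\rangle \\
&= \langle\lambda',j,m'_j|\,A_\pm\,|\lambda,j,m_j\mp1\rangle\,\langle j,m_j\mp1|\,J_\mp\,|j,m_j\rangle \\
&\quad - \langle\lambda',j,m'_j\pm1|\,A_\pm\,|\lambda,j,m_j\rangle\,\langle j,m'_j|\,J_\mp\,|j,m'_j\pm1\rangle
\end{aligned} \tag{10.356}$$

Now substituting in the matrix element of $A_\pm$ we get

$$\begin{aligned}
\pm2\hbar\,\langle\lambda',j,m'_j|\,A_0\,|\lambda,j,m_j\rangle = \langle\lambda',j\|A\|\lambda,j\rangle_\pm\,\big[&\langle j,m'_j|\,J_\pm\,|j,m_j\mp1\rangle\,\langle j,m_j\mp1|\,J_\mp\,|j,m_j\rangle \\
&- \langle j,m'_j\pm1|\,J_\pm\,|j,m_j\rangle\,\langle j,m'_j|\,J_\mp\,|j,m'_j\pm1\rangle\big]
\end{aligned} \tag{10.357}$$

This says that $A_0$ has non-vanishing matrix elements only when $m'_j = m_j$. We
then get

$$\begin{aligned}
\pm2\hbar\,\langle\lambda',j,m_j|\,A_0\,|\lambda,j,m_j\rangle &= \langle\lambda',j\|A\|\lambda,j\rangle_\pm\left[\left|\langle j,m_j\mp1|\,J_\mp\,|j,m_j\rangle\right|^2 - \left|\langle j,m_j\pm1|\,J_\pm\,|j,m_j\rangle\right|^2\right] \\
&= \langle\lambda',j\|A\|\lambda,j\rangle_\pm\left(\pm2\hbar^2m_j\right) = \pm2\hbar\,\langle\lambda',j\|A\|\lambda,j\rangle_\pm\,\langle j,m_j|\,J_0\,|j,m_j\rangle
\end{aligned} \tag{10.358}$$

Putting it all together we get

$$\langle\lambda',j,m'_j|\,A_0\,|\lambda,j,m_j\rangle = \langle\lambda',j\|A\|\lambda,j\rangle_\pm\,\langle j,m'_j|\,J_0\,|j,m_j\rangle \tag{10.359}$$

Since no operator has a ± subscript, this also says that

$$\langle\lambda',j\|A\|\lambda,j\rangle_+ = \langle\lambda',j\|A\|\lambda,j\rangle_- = \langle\lambda',j\|A\|\lambda,j\rangle \tag{10.360}$$

and we finally have

$$\langle\lambda',j,m'_j|\,\vec{A}_{op}\,|\lambda,j,m_j\rangle = \langle\lambda',j\|A\|\lambda,j\rangle\,\langle j,m'_j|\,\vec{J}_{op}\,|j,m_j\rangle \tag{10.361}$$

This completes the proof of the Wigner-Eckart theorem.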


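A very important extension of this theorem is the following result:

$$\langle\lambda',j,m'_j|\,\vec{A}_{op}\cdot\vec{J}_{op}\,|\lambda,j,m_j\rangle = \langle\lambda',j\|A\|\lambda,j\rangle\,\langle j,m'_j|\,\vec{J}_{op}^{\,2}\,|j,m_j\rangle = \delta_{m'_j,m_j}\,\hbar^2j(j+1)\,\langle\lambda',j\|A\|\lambda,j\rangle \tag{10.362}$$

which says that the scalar product is diagonal in $m_j$. This result follows directly
from the Wigner-Eckart theorem

$$\begin{aligned}
\langle\lambda',j,m'_j|\,\vec{A}_{op}\cdot\vec{J}_{op}\,|\lambda,j,m_j\rangle &= \sum_k\langle\lambda',j,m'_j|\,A_{op,k}J_{op,k}\,|\lambda,j,m_j\rangle \\
&= \sum_k\langle\lambda',j,m'_j|\,A_{op,k}\Big(\sum_{m''_j}|\lambda,j,m''_j\rangle\langle\lambda,j,m''_j|\Big)J_{op,k}\,|\lambda,j,m_j\rangle \\
&= \sum_k\sum_{m''_j}\langle\lambda',j,m'_j|\,A_{op,k}\,|\lambda,j,m''_j\rangle\,\langle\lambda,j,m''_j|\,J_{op,k}\,|\lambda,j,m_j\rangle \\
&= \sum_{m''_j}\sum_k\langle\lambda',j\|A\|\lambda,j\rangle\,\langle j,m'_j|\,J_{op,k}\,|j,m''_j\rangle\,\langle j,m''_j|\,J_{op,k}\,|j,m_j\rangle \\
&= \langle\lambda',j\|A\|\lambda,j\rangle\sum_k\langle j,m'_j|\,J_{op,k}\Big(\sum_{m''_j}|j,m''_j\rangle\langle j,m''_j|\Big)J_{op,k}\,|j,m_j\rangle \\
&= \langle\lambda',j\|A\|\lambda,j\rangle\,\langle j,m'_j|\,\vec{J}_{op}^{\,2}\,|j,m_j\rangle = \delta_{m'_j,m_j}\,\hbar^2j(j+1)\,\langle\lambda',j\|A\|\lambda,j\rangle
\end{aligned}$$

Here the sum over $m''_j$ can be inserted because $J_{op,k}$ acting on $|\lambda,j,m_j\rangle$ changes
neither $\lambda$ nor j.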

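Now back to the Zeeman effect. In the low field limit, we need to evaluate the diagonal matrix elements

$$\langle \ell s j m_j | (L_z + 2S_z) | \ell s j m_j \rangle = \langle \ell s j m_j | (J_z + S_z) | \ell s j m_j \rangle = \hbar m_j + \langle \ell s j m_j | S_z | \ell s j m_j \rangle \tag{10.363}$$

Now the Wigner-Eckart theorem says that

$$\langle \ell s j m_j | S_z | \ell s j m_j \rangle = \langle \ell s j \| S \| \ell s j \rangle \langle \ell s j m_j | J_z | \ell s j m_j \rangle = \hbar m_j \langle \ell s j \| S \| \ell s j \rangle \tag{10.364}$$

The scalar product matrix element formula gives

$$\langle \ell s j m_j | \vec{S}_{op} \cdot \vec{J}_{op} | \ell s j m_j \rangle = \langle \ell s j \| S \| \ell s j \rangle \langle j m_j | \vec{J}_{op}^{\,2} | j m_j \rangle = \hbar^2 j(j+1) \langle \ell s j \| S \| \ell s j \rangle \tag{10.365}$$

But we also have

$$(\vec{J}_{op} - \vec{S}_{op})^2 = \vec{L}_{op}^{\,2} = \vec{J}_{op}^{\,2} + \vec{S}_{op}^{\,2} - 2 \vec{S}_{op} \cdot \vec{J}_{op} \tag{10.366}$$

$$\langle \ell s j m_j | \vec{S}_{op} \cdot \vec{J}_{op} | \ell s j m_j \rangle = \frac{1}{2} \langle \ell s j m_j | (\vec{J}_{op}^{\,2} + \vec{S}_{op}^{\,2} - \vec{L}_{op}^{\,2}) | \ell s j m_j \rangle = \frac{\hbar^2}{2} \left( j(j+1) + s(s+1) - \ell(\ell+1) \right) \tag{10.367}$$

or

$$\langle \ell s j \| S \| \ell s j \rangle = \frac{j(j+1) + s(s+1) - \ell(\ell+1)}{2j(j+1)} \tag{10.368}$$

and thus

$$\langle \ell s j m_j | (L_z + 2S_z) | \ell s j m_j \rangle = \hbar m_j + \hbar m_j \frac{j(j+1) + s(s+1) - \ell(\ell+1)}{2j(j+1)} = \hbar m_j g_{j\ell s} \tag{10.369}$$

$$g_{j\ell s} = 1 + \frac{j(j+1) + s(s+1) - \ell(\ell+1)}{2j(j+1)} = \text{Lande g-factor} \tag{10.370}$$

Finally, we have

$$\langle \ell s j m_j | H_{Zeeman} | \ell s j m_j \rangle = \mu_B B m_j g_{j\ell s} \tag{10.371}$$

and the result we found earlier in the special example case is now proved in general.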
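As a quick numerical illustration of (10.370) and (10.371), the following minimal sketch evaluates the Lande g-factor with exact rational arithmetic. The function names `lande_g` and `zeeman_shift` are illustrative choices, not part of the text, and the default Bohr magneton is the CGS value quoted in the next subsection.

```python
from fractions import Fraction


def lande_g(j, l, s):
    """Lande g-factor g = 1 + [j(j+1) + s(s+1) - l(l+1)] / [2 j(j+1)]."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    if j == 0:
        raise ValueError("the g-factor is undefined for j = 0")
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))


def zeeman_shift(j, l, s, m_j, b_gauss, mu_b=5.7884e-9):
    """Weak-field Zeeman shift mu_B * B * m_j * g in eV (B in gauss)."""
    return mu_b * b_gauss * float(Fraction(m_j) * lande_g(j, l, s))


half = Fraction(1, 2)
for label, j, l in [("2S1/2", half, 0), ("2P1/2", half, 1), ("2P3/2", 3 * half, 1)]:
    print(label, "g =", lande_g(j, l, half))
```

For hydrogen ($s = 1/2$) this gives $g = 2$ for $S_{1/2}$, $g = 2/3$ for $P_{1/2}$ and $g = 4/3$ for $P_{3/2}$.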


10.6.4. Paschen-Back Effect

When $B$ is large enough such that $\Delta E_{Zeeman} \gg \Delta E_{so}$, but not large enough so that the $B^2$ term we neglected earlier is important, we have the so-called Paschen-Back effect. If the $B^2$ term is dominant we have the so-called quadratic Zeeman effect.

The best way to see what is happening for all magnetic field values is a plot. In CGS Gaussian units

$$\mu_B = 5.7884 \times 10^{-9} \ \frac{\text{eV}}{\text{gauss}} \ , \quad a_0 = 5.2918 \times 10^{-9} \ \text{cm} \ , \quad e = 4.80 \times 10^{-10} \ \text{esu}$$

$$\frac{e^2}{a_0} = 27.2 \ \text{eV} \ , \quad a = 1.509 \times 10^{-5} \ \text{eV} \ , \quad b = \mu_B B = 5.7884 \times 10^{-9} \, B \ \text{eV} \quad (B \text{ in gauss})$$
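Using our earlier results we then have

$$\begin{aligned}
E_1 &= -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} + 2\mu_B B \\
E_2 &= -\frac{e^2}{8a_0} + \frac{1}{2}\left( -(a - b) + \sqrt{9a^2 + 2ab + b^2} \right) \\
E_3 &= -\frac{e^2}{8a_0} + \frac{1}{2}\left( -(a - b) - \sqrt{9a^2 + 2ab + b^2} \right) \\
E_4 &= -\frac{e^2}{8a_0} + \frac{1}{2}\left( -(a + b) + \sqrt{9a^2 - 2ab + b^2} \right) \\
E_5 &= -\frac{e^2}{8a_0} + \frac{1}{2}\left( -(a + b) - \sqrt{9a^2 - 2ab + b^2} \right) \\
E_6 &= -\frac{e^2}{8a_0} + \frac{e^2}{8a_0}\frac{\alpha^2}{12} - 2\mu_B B \\
E_7 &= -\frac{e^2}{8a_0} + \mu_B B \\
E_8 &= -\frac{e^2}{8a_0} - \mu_B B
\end{aligned}$$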

A plot of

$$\left( E + \frac{e^2}{8a_0} \right) \times 10^5 \ \text{eV versus } \log_e (B(\text{gauss}))$$

looks like Figure 10.2 below.

Figure 10.2: Hydrogen Atom In a Magnetic Field - Zeeman Effect

This plot for fields below 400 gauss ($\log_e(B) \approx 6$) shows the characteristic level structure of the Zeeman effect.

The very large magnetic field Paschen-Back effect is illustrated in Figure 10.3 below.

Figure 10.3: Hydrogen Atom In a Magnetic Field - Paschen-Back Effect

Notice the equally-spaced level signature of the Paschen-Back effect.
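The two limits can also be checked numerically. The sketch below is a minimal illustration, not the procedure used to produce the figures: it evaluates the eight $n = 2$ level shifts relative to $-e^2/8a_0$, using the quoted values of $a$ and $\mu_B$. The function name `n2_level_shifts` is hypothetical, and the forms used for $E_1$ and $E_6$ assume that the $m_j = \pm 3/2$ states shift by $a \pm 2b$.

```python
import math

MU_B = 5.7884e-9   # Bohr magneton in eV/gauss (value quoted in the text)
A_SO = 1.509e-5    # spin-orbit parameter a in eV (value quoted in the text)


def n2_level_shifts(b_gauss):
    """Return [E1..E8] + e^2/(8 a_0) in eV for a field of b_gauss gauss."""
    a = A_SO
    b = MU_B * b_gauss
    root_plus = math.sqrt(9 * a * a + 2 * a * b + b * b)
    root_minus = math.sqrt(9 * a * a - 2 * a * b + b * b)
    return [
        a + 2 * b,                          # E1 (assumed form, m_j = +3/2)
        0.5 * (-(a - b) + root_plus),       # E2
        0.5 * (-(a - b) - root_plus),       # E3
        0.5 * (-(a + b) + root_minus),      # E4
        0.5 * (-(a + b) - root_minus),      # E5
        a - 2 * b,                          # E6 (assumed form, m_j = -3/2)
        b,                                  # E7
        -b,                                 # E8
    ]


for field in (0.0, 400.0, 1.0e6):
    print(field, ["%.3e" % shift for shift in n2_level_shifts(field)])
```

At $B = 0$ the shifts collapse to $a$, $-2a$ and $0$, the spin-orbit levels. For $b \gg a$ they approach integer multiples of $b$, which is the equal spacing characteristic of the Paschen-Back regime.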

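We now define some notation that will be important later as we study atomic spectra. For small magnetic fields we found that the approximate state vectors are the $|n \ell s j m_j\rangle$ states. The energy levels including spin-orbit effects are

$$E_n = E_n^{(0)} + \Delta E_{so} \ , \quad E_n^{(0)} = -\frac{Z^2 e^2}{2a_0 n^2} \ , \quad \Delta E_{so} = \frac{Z^2\alpha^2 |E_n^{(0)}|}{n(2\ell+1)} \times \begin{cases} \dfrac{1}{\ell+1} & j = \ell + \tfrac{1}{2} \\[2mm] -\dfrac{1}{\ell} & j = \ell - \tfrac{1}{2} \end{cases} \quad (\ell \neq 0) \tag{10.372}$$

We define a spectroscopic notation to label the energy levels using the scheme shown below:

$$|n \ell s j m_j\rangle \rightarrow n \, {}^{2S+1}L(\text{symbol})_j \tag{10.373}$$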

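so that

$$\left| 2 \, 1 \, \tfrac{1}{2} \, \tfrac{3}{2} \, m_j \right\rangle \rightarrow 2\,{}^2P_{3/2} \ , \quad \left| 2 \, 1 \, \tfrac{1}{2} \, \tfrac{1}{2} \, m_j \right\rangle \rightarrow 2\,{}^2P_{1/2} \ , \quad \left| 2 \, 0 \, \tfrac{1}{2} \, \tfrac{1}{2} \, m_j \right\rangle \rightarrow 2\,{}^2S_{1/2}$$

The L(symbols) are defined by Table 10.12 below.

L        0   1   2   3
Symbol   S   P   D   F

Table 10.12: Spectroscopic Labels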


The energy level diagram for n = 1 and n = 2 is shown in Figure 10.4 below.

Figure 10.4: Spin-Orbit Energy Levels (columns: Zero-Order, Spin-Orbit; levels shown: $2\,{}^2P_{3/2}$, $2\,{}^2S_{1/2}$, $2\,{}^2P_{1/2}$ for n = 2 and $1\,{}^2S_{1/2}$ for n = 1)

Earlier we calculated the relativistic correction and found that it was the same order of magnitude as the spin-orbit correction for hydrogen. We found

$$\Delta E_{rel} = -\frac{Z^2\alpha^2 |E_n^{(0)}|}{n} \left( \frac{2}{2\ell+1} - \frac{3}{4n} \right) \tag{10.374}$$

Combining these two corrections we have

$$\Delta E_{fs} = \Delta E_{so} + \Delta E_{rel} = -\frac{Z^2\alpha^2 |E_n^{(0)}|}{n} \left( \frac{1}{j + \frac{1}{2}} - \frac{3}{4n} \right) \ , \quad j = \ell \pm \frac{1}{2} \tag{10.375}$$

which is independent of $\ell$. This changes the energy level structure to that shown in Figure 10.5 below.

Figure 10.5: Fine Structure Energy Levels (columns: Zero-Order, Spin-Orbit, Fine Structure)

The observed spectral lines result from an electron making a transition between 
these levels. We will discuss this topic later. 
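The cancellation of the $\ell$ dependence can be checked with exact fractions. The sketch below works in units of $Z^2\alpha^2|E_n^{(0)}|$ and assumes the standard spin-orbit result for $\ell \geq 1$: $+1/[n(2\ell+1)(\ell+1)]$ for $j = \ell + 1/2$ and $-1/[n(2\ell+1)\ell]$ for $j = \ell - 1/2$. The function names are illustrative only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit(n, l, j):
    """Spin-orbit shift in units of Z^2 alpha^2 |E_n^(0)| (assumed form, l >= 1)."""
    if l < 1:
        raise ValueError("this sketch only covers l >= 1")
    if j == l + HALF:
        return Fraction(1, n * (2 * l + 1) * (l + 1))
    if j == l - HALF:
        return Fraction(-1, n * (2 * l + 1) * l)
    raise ValueError("j must equal l + 1/2 or l - 1/2")


def relativistic(n, l):
    """Relativistic shift, same units: -(1/n) (2/(2l+1) - 3/(4n))."""
    return -Fraction(1, n) * (Fraction(2, 2 * l + 1) - Fraction(3, 4 * n))


def fine_structure(n, j):
    """Combined shift, same units: -(1/n) (1/(j+1/2) - 3/(4n))."""
    return -Fraction(1, n) * (1 / (Fraction(j) + HALF) - Fraction(3, 4 * n))


for n in range(2, 5):
    for l in range(1, n):
        for j in (l - HALF, l + HALF):
            total = spin_orbit(n, l, j) + relativistic(n, l)
            print(n, l, j, total, total == fine_structure(n, j))
```

For n = 2 the combined shifts are $-5/16$ for $j = 1/2$ and $-1/16$ for $j = 3/2$ in these units, whatever the value of $\ell$.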

10.6.5. Stark Effect 

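When a hydrogen atom is placed in an external electric field $\vec{\mathcal{E}}_0$, the potential energy of the proton and the electron is given by (taking the field along the z-axis)

$$V_{dipole}(\vec{r}_e, \vec{r}_p) = -e\vec{\mathcal{E}}_0 \cdot (\vec{r}_p - \vec{r}_e) = e\mathcal{E}_0 (z_e - z_p) = e\mathcal{E}_0 z \tag{10.376}$$

where

$$z = z_e - z_p = z_{relative} \tag{10.377}$$

Therefore, we consider the Hamiltonian

$$H = H_0 + H_{dipole} \tag{10.378}$$

where

$$H_0 = \frac{\vec{p}_{op}^{\,2}}{2\mu} - \frac{e^2}{r} \tag{10.379}$$

$$H_{dipole} = e\mathcal{E}_0 z_{op} \tag{10.380}$$

For weak electric fields, we can apply perturbation theory (we ignore spin in this calculation). First, we apply perturbation theory to the n = 1 ground state of the hydrogen atom.

For the ground state, the wave function is $\psi_{100}(\vec{r})$ and the first-order correction to the energy is

$$\begin{aligned}
E_1^{(1)} &= \langle 100 | e\mathcal{E}_0 z_{op} | 100 \rangle \\
&= e\mathcal{E}_0 \int d^3\vec{r} \, d^3\vec{r}\,' \langle 100 | \vec{r} \rangle \langle \vec{r} | z_{op} | \vec{r}\,' \rangle \langle \vec{r}\,' | 100 \rangle \\
&= e\mathcal{E}_0 \int d^3\vec{r} \, d^3\vec{r}\,' \, z \, \psi^*_{100}(\vec{r}) \langle \vec{r} | \vec{r}\,' \rangle \psi_{100}(\vec{r}\,') \\
&= e\mathcal{E}_0 \int d^3\vec{r} \, d^3\vec{r}\,' \, z \, \psi^*_{100}(\vec{r}) \delta(\vec{r} - \vec{r}\,') \psi_{100}(\vec{r}\,') \\
&= e\mathcal{E}_0 \int d^3\vec{r} \, z \, |\psi_{100}(\vec{r})|^2
\end{aligned} \tag{10.381}$$

This equals zero since the integrand is the product of an even function and an odd function. Thus, the first-order correction is zero for the ground state.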


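The second-order correction is given by non-degenerate perturbation theory as

$$E_1^{(2)} = \sum_{n=2}^{\infty} \sum_{\ell=0}^{n-1} \sum_{m=-\ell}^{\ell} \frac{|\langle n\ell m | e\mathcal{E}_0 z_{op} | 100 \rangle|^2}{E_1^{(0)} - E_n^{(0)}} \tag{10.382}$$

Using $z = r\cos\theta$ we have

$$\langle n\ell m | z_{op} | 100 \rangle = \int d^3r \, [R_{n\ell}(r) Y^*_{\ell m}(\theta,\phi)] \, [r\cos\theta] \, R_{10}(r) Y_{00}(\theta,\phi) \tag{10.383}$$

Now

$$Y_{00} = \frac{1}{\sqrt{4\pi}} \quad \text{and} \quad z = r\cos\theta = \sqrt{\frac{4\pi}{3}} \, r \, Y_{10} \tag{10.384}$$

Therefore,

$$\langle n\ell m | z_{op} | 100 \rangle = \frac{1}{\sqrt{3}} \int_0^\infty r^3 dr \, R_{n\ell}(r) R_{10}(r) \int d\Omega \, Y^*_{\ell m}(\theta,\phi) Y_{10}(\theta,\phi) \tag{10.385}$$

Now

$$\int d\Omega \, Y^*_{\ell m}(\theta,\phi) Y_{10}(\theta,\phi) = \delta_{\ell,1}\delta_{m,0} \tag{10.386}$$

by the orthonormality of the $(\vec{L}_{op}^{\,2}, L_z)$ eigenfunctions. Therefore,

$$\langle n\ell m | z_{op} | 100 \rangle = \frac{1}{\sqrt{3}} \delta_{\ell,1}\delta_{m,0} \int_0^\infty r^3 dr \, R_{n1}(r) R_{10}(r) \tag{10.387}$$

and

$$|\langle n10 | z_{op} | 100 \rangle|^2 = \frac{1}{3} \frac{2^8 n^7 (n-1)^{2n-5}}{(n+1)^{2n+5}} a_0^2 = \beta(n) a_0^2 \tag{10.388}$$

Finally,

$$E_1^{(2)} = (e\mathcal{E}_0 a_0)^2 \sum_{n=2}^{\infty} \frac{\beta(n)}{-\frac{e^2}{2a_0}\left( 1 - \frac{1}{n^2} \right)} = -2F\mathcal{E}_0^2 a_0^3 \tag{10.389}$$

where

$$F = \sum_{n=2}^{\infty} \frac{n^2\beta(n)}{n^2 - 1} \approx 1.125 \tag{10.390}$$

Therefore, the ground state exhibits a quadratic Stark effect.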
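The sum over bound states in (10.390) is easy to evaluate numerically. The following sketch (the names `beta` and `stark_sum` are illustrative) uses logarithms to avoid overflow in the large powers. A caveat worth keeping in mind: the discrete states alone give a partial sum of roughly 0.92, while the exact ground-state polarizability of hydrogen, $\frac{9}{2}a_0^3$, corresponds to $F = 9/8 = 1.125$, so the quoted value should be read as also including the continuum states, which this sketch does not sum.

```python
import math


def beta(n):
    """beta(n) = 2^8 n^7 (n-1)^(2n-5) / (3 (n+1)^(2n+5)), evaluated via logs."""
    log_beta = (8 * math.log(2) + 7 * math.log(n)
                + (2 * n - 5) * math.log(n - 1)
                - math.log(3) - (2 * n + 5) * math.log(n + 1))
    return math.exp(log_beta)


def stark_sum(n_max):
    """Partial sum of n^2 beta(n) / (n^2 - 1) over bound states n = 2..n_max."""
    return sum(n * n * beta(n) / (n * n - 1) for n in range(2, n_max + 1))


for n_max in (2, 5, 50, 2000):
    print(n_max, round(stark_sum(n_max), 4))
```

The $n = 2$ term alone contributes about 0.74, which shows how strongly the $2p$ state dominates the bound-state part of the sum.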


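The n = 2 level, which is the first excited state of hydrogen, has 4 degenerate states.

$$n = 2 \rightarrow \begin{cases} \ell = 0 \rightarrow \psi_{200} = \psi_1 \\ \ell = 1 \rightarrow \begin{cases} m = 1 \rightarrow \psi_{211} = \psi_2 \\ m = 0 \rightarrow \psi_{210} = \psi_3 \\ m = -1 \rightarrow \psi_{21-1} = \psi_4 \end{cases} \end{cases}$$

We must use degenerate perturbation theory. We construct the 4x4 $\langle e\mathcal{E}_0 z_{op} \rangle$ matrix and then diagonalize it. We have

$$\langle e\mathcal{E}_0 z_{op} \rangle = e\mathcal{E}_0 \begin{pmatrix}
\langle 1| z_{op} |1\rangle & \langle 1| z_{op} |2\rangle & \langle 1| z_{op} |3\rangle & \langle 1| z_{op} |4\rangle \\
\langle 2| z_{op} |1\rangle & \langle 2| z_{op} |2\rangle & \langle 2| z_{op} |3\rangle & \langle 2| z_{op} |4\rangle \\
\langle 3| z_{op} |1\rangle & \langle 3| z_{op} |2\rangle & \langle 3| z_{op} |3\rangle & \langle 3| z_{op} |4\rangle \\
\langle 4| z_{op} |1\rangle & \langle 4| z_{op} |2\rangle & \langle 4| z_{op} |3\rangle & \langle 4| z_{op} |4\rangle
\end{pmatrix} \tag{10.391}$$

Now $z$ has no $\phi$ dependence and therefore,

$$\langle j | z_{op} | k \rangle = 0 \quad \text{if } m_j \neq m_k \tag{10.392}$$

Thus,

$$\begin{aligned}
\langle 1| z_{op} |2\rangle &= 0 = \langle 1| z_{op} |4\rangle \\
\langle 2| z_{op} |1\rangle &= 0 = \langle 2| z_{op} |3\rangle = \langle 2| z_{op} |4\rangle \\
\langle 3| z_{op} |2\rangle &= 0 = \langle 3| z_{op} |4\rangle \\
\langle 4| z_{op} |1\rangle &= 0 = \langle 4| z_{op} |2\rangle = \langle 4| z_{op} |3\rangle
\end{aligned}$$

and the matrix becomes

$$\langle e\mathcal{E}_0 z_{op} \rangle = e\mathcal{E}_0 \begin{pmatrix}
\langle 1| z_{op} |1\rangle & 0 & \langle 1| z_{op} |3\rangle & 0 \\
0 & \langle 2| z_{op} |2\rangle & 0 & 0 \\
\langle 3| z_{op} |1\rangle & 0 & \langle 3| z_{op} |3\rangle & 0 \\
0 & 0 & 0 & \langle 4| z_{op} |4\rangle
\end{pmatrix} \tag{10.393}$$

We also have

$$\langle 1| z_{op} |1\rangle = 0 = \langle 2| z_{op} |2\rangle = \langle 3| z_{op} |3\rangle = \langle 4| z_{op} |4\rangle \tag{10.394}$$

since these integrands involve the product of even and odd functions.

Finding out which matrix elements are equal to zero without actually evaluating the integrals corresponds to finding what are called selection rules. We will elaborate on the idea of selection rules in the next section on the Van der Waals interaction.

Thus, the matrix finally becomes (after relabeling the rows and columns)

$$\langle e\mathcal{E}_0 z_{op} \rangle = e\mathcal{E}_0 \begin{pmatrix}
0 & \langle 1| z_{op} |3\rangle & 0 & 0 \\
\langle 3| z_{op} |1\rangle & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix} \tag{10.395}$$

where

$$\langle 1| z_{op} |3\rangle = \langle 3| z_{op} |1\rangle = \int \psi^*_{200}(\vec{r}) \, z \, \psi_{210}(\vec{r}) \, d^3r = -3a_0 \tag{10.396}$$

Diagonalizing the 2x2 submatrix gives eigenvalues $\pm 3e\mathcal{E}_0 a_0$. The first-order energies and new zero-order wave functions are

$$\psi_{211}(\vec{r}) \rightarrow E_{211} = E_2^{(0)} \quad \text{remains degenerate} \tag{10.397}$$

$$\psi_{21-1}(\vec{r}) \rightarrow E_{21-1} = E_2^{(0)} \quad \text{remains degenerate} \tag{10.398}$$

$$\psi_{(+)}(\vec{r}) = \frac{1}{\sqrt{2}} \left( \psi_{200}(\vec{r}) - \psi_{210}(\vec{r}) \right) \rightarrow E_+ = E_2^{(0)} + 3e\mathcal{E}_0 a_0$$

$$\psi_{(-)}(\vec{r}) = \frac{1}{\sqrt{2}} \left( \psi_{200}(\vec{r}) + \psi_{210}(\vec{r}) \right) \rightarrow E_- = E_2^{(0)} - 3e\mathcal{E}_0 a_0$$

The degeneracy is broken for the m = 0 levels and we see a linear Stark effect. The linear Stark effect only appears for degenerate levels.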
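The single nonzero matrix element can be checked by direct numerical integration. This minimal sketch works in units where $a_0 = 1$ and assumes the standard normalized hydrogen radial functions $R_{20} = \frac{1}{\sqrt{2}}(1 - r/2)e^{-r/2}$ and $R_{21} = \frac{1}{2\sqrt{6}} \, r \, e^{-r/2}$. The helper `simpson` is a plain composite Simpson rule written for this illustration.

```python
import math


def r20(r):
    """Hydrogen radial function R_20 in units a_0 = 1."""
    return (1.0 / math.sqrt(2.0)) * (1.0 - r / 2.0) * math.exp(-r / 2.0)


def r21(r):
    """Hydrogen radial function R_21 in units a_0 = 1."""
    return (1.0 / (2.0 * math.sqrt(6.0))) * r * math.exp(-r / 2.0)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# Radial integral of r^3 R_20 R_21; the upper limit 80 stands in for infinity.
radial = simpson(lambda r: r ** 3 * r20(r) * r21(r), 0.0, 80.0)

# Angular factor: integral of Y_00^* cos(theta) Y_10 over the sphere = 1/sqrt(3).
z_200_210 = radial / math.sqrt(3.0)

print("radial integral       :", radial)      # close to -3*sqrt(3)
print("<200| z |210> / a_0   :", z_200_210)   # close to -3
print("first-order shifts    : +/-", abs(z_200_210), "e E_0 a_0")
```

The eigenvalues of the 2x2 block with zero diagonal and off-diagonal element $w$ are $\pm|w|$, which is where the $\pm 3e\mathcal{E}_0 a_0$ splitting comes from.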


10.6.6. Van der Waals Interaction

We now consider a system consisting of two widely separated atoms. In particular, we consider the interaction between two hydrogen atoms, where we treat the two protons as fixed point charges separated by a vector $\vec{R}$ and we define

$\vec{r}_1$ = vector from first proton to its electron
$\vec{r}_2$ = vector from second proton to its electron

as shown in Figure 10.6 below.

Figure 10.6: Van der Waals System

The Hamiltonian is given by

$$H = H_0 + V \tag{10.399}$$

where

$$H_0 = \frac{\vec{p}_{1,op}^{\,2}}{2\mu} - \frac{e^2}{r_1} + \frac{\vec{p}_{2,op}^{\,2}}{2\mu} - \frac{e^2}{r_2} = \text{2 non-interacting hydrogen atoms} \tag{10.400}$$

and

$$\begin{aligned}
V &= \text{rest of the Coulomb interactions} = V_{p_1p_2} + V_{e_1e_2} + V_{e_1p_2} + V_{e_2p_1} \\
&= e^2 \left( \frac{1}{R} + \frac{1}{|\vec{R} + \vec{r}_2 - \vec{r}_1|} - \frac{1}{|\vec{R} + \vec{r}_2|} - \frac{1}{|\vec{R} - \vec{r}_1|} \right)
\end{aligned} \tag{10.401}$$

This is the perturbation potential.

We know the zero-order solution for two non-interacting hydrogen atoms. It is

$$\text{zero-order states}: \ |n_1\ell_1m_1\rangle |n_2\ell_2m_2\rangle \tag{10.402}$$

with

$$\text{zero-order energies}: \ E^{(0)}_{n_1n_2} = -\frac{e^2}{2a_0} \left( \frac{1}{n_1^2} + \frac{1}{n_2^2} \right) \tag{10.403}$$

where

$$H_0 |n_1\ell_1m_1\rangle |n_2\ell_2m_2\rangle = -\frac{e^2}{2a_0} \left( \frac{1}{n_1^2} + \frac{1}{n_2^2} \right) |n_1\ell_1m_1\rangle |n_2\ell_2m_2\rangle \tag{10.404}$$
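This expression for the perturbation potential is too complicated to calculate. We will need to make an approximation. We make the reasonable assumption that

$$R \gg r_1 \quad \text{and} \quad R \gg r_2 \tag{10.405}$$

We have two useful mathematical results that we can apply. In general, we can write

$$\frac{1}{|\vec{R} + \vec{a}|} = \frac{1}{[R^2 + 2\vec{R}\cdot\vec{a} + a^2]^{1/2}} = \frac{1}{R} \left[ 1 + \frac{2\vec{R}\cdot\vec{a}}{R^2} + \frac{a^2}{R^2} \right]^{-1/2} \tag{10.406}$$

and for small $x$ we have

$$[1 + x]^{-1/2} \approx 1 - \frac{1}{2}x + \frac{3}{8}x^2 - \frac{5}{16}x^3 + \ldots \tag{10.407}$$

Using

$$x = \frac{2\vec{R}\cdot\vec{a}}{R^2} + \frac{a^2}{R^2} \tag{10.408}$$

we get the general result

$$\begin{aligned}
\frac{1}{|\vec{R} + \vec{a}|} &= \frac{1}{R} \left[ 1 - \frac{1}{2} \left( \frac{2\vec{R}\cdot\vec{a}}{R^2} + \frac{a^2}{R^2} \right) + \frac{3}{8} \left( \frac{2\vec{R}\cdot\vec{a}}{R^2} + \frac{a^2}{R^2} \right)^2 + \ldots \right] \\
&\approx \frac{1}{R} - \frac{\vec{a}\cdot\vec{R}}{R^3} - \frac{1}{2}\frac{a^2}{R^3} + \frac{3}{2}\frac{(\vec{a}\cdot\vec{R})^2}{R^5} + \ldots
\end{aligned} \tag{10.409}$$

Therefore, we have

$$\frac{1}{|\vec{R} + \vec{r}_2|} = \frac{1}{R} - \frac{\vec{r}_2\cdot\vec{R}}{R^3} - \frac{1}{2}\frac{r_2^2}{R^3} + \frac{3}{2}\frac{(\vec{r}_2\cdot\vec{R})^2}{R^5} \tag{10.410}$$

$$\frac{1}{|\vec{R} - \vec{r}_1|} = \frac{1}{R} + \frac{\vec{r}_1\cdot\vec{R}}{R^3} - \frac{1}{2}\frac{r_1^2}{R^3} + \frac{3}{2}\frac{(\vec{r}_1\cdot\vec{R})^2}{R^5} \tag{10.411}$$

$$\begin{aligned}
\frac{1}{|\vec{R} + \vec{r}_2 - \vec{r}_1|} &= \frac{1}{R} - \frac{(\vec{r}_2 - \vec{r}_1)\cdot\vec{R}}{R^3} - \frac{1}{2}\frac{(\vec{r}_2 - \vec{r}_1)^2}{R^3} + \frac{3}{2}\frac{((\vec{r}_2 - \vec{r}_1)\cdot\vec{R})^2}{R^5} \\
&= \frac{1}{R} - \frac{\vec{r}_2\cdot\vec{R}}{R^3} + \frac{\vec{r}_1\cdot\vec{R}}{R^3} - \frac{1}{2}\frac{r_1^2}{R^3} - \frac{1}{2}\frac{r_2^2}{R^3} + \frac{\vec{r}_1\cdot\vec{r}_2}{R^3} \\
&\quad + \frac{3}{2}\frac{(\vec{r}_1\cdot\vec{R})^2}{R^5} + \frac{3}{2}\frac{(\vec{r}_2\cdot\vec{R})^2}{R^5} - 3\frac{(\vec{r}_1\cdot\vec{R})(\vec{r}_2\cdot\vec{R})}{R^5}
\end{aligned} \tag{10.412}$$

Putting all this together we get

$$V = \frac{e^2}{R^3} \left[ \vec{r}_1\cdot\vec{r}_2 - 3\frac{(\vec{r}_1\cdot\vec{R})(\vec{r}_2\cdot\vec{R})}{R^2} \right] \tag{10.413}$$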







Physically, this says that for large separations, the interaction between the atoms is the same as that between two dipoles $e\vec{r}_1$ and $e\vec{r}_2$ separated by $\vec{R}$.

To simplify the algebra, we now choose the vector $\vec{R}$ to lie along the z-axis

$$\vec{R} = R\hat{z} \tag{10.414}$$

which gives

$$V = \frac{e^2}{R^3} \left[ (x_1x_2 + y_1y_2 + z_1z_2) - 3\frac{z_1z_2R^2}{R^2} \right] = \frac{e^2}{R^3} (x_1x_2 + y_1y_2 - 2z_1z_2) \tag{10.415}$$
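How good the dipole-dipole approximation is can be seen by comparing (10.415) with the full Coulomb expression (10.401) for sample electron positions. The sketch below works in units of $e^2$ with lengths in arbitrary units. The function names and the sample vectors are illustrative choices only.

```python
import math


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def exact_interaction(r1, r2, R):
    """V / e^2 from the four Coulomb terms, with the protons separated along z."""
    Rv = (0.0, 0.0, R)
    d_ee = [Rv[i] + r2[i] - r1[i] for i in range(3)]    # electron 1 to electron 2
    d_e2p1 = [Rv[i] + r2[i] for i in range(3)]          # electron 2 to proton 1
    d_e1p2 = [Rv[i] - r1[i] for i in range(3)]          # electron 1 to proton 2
    return 1.0 / R + 1.0 / norm(d_ee) - 1.0 / norm(d_e2p1) - 1.0 / norm(d_e1p2)


def dipole_interaction(r1, r2, R):
    """V / e^2 in the dipole-dipole approximation."""
    x1, y1, z1 = r1
    x2, y2, z2 = r2
    return (x1 * x2 + y1 * y2 - 2.0 * z1 * z2) / R ** 3


def relative_error(r1, r2, R):
    exact = exact_interaction(r1, r2, R)
    return abs(dipole_interaction(r1, r2, R) - exact) / abs(exact)


r1 = (1.0, 0.5, 0.0)
r2 = (0.8, -0.2, 0.3)
for R in (20.0, 200.0):
    print(R, exact_interaction(r1, r2, R), dipole_interaction(r1, r2, R),
          relative_error(r1, r2, R))
```

The neglected terms are of higher order in $r/R$, so the agreement is already good once the separation is a few tens of atomic radii.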


We now specialize to consider the case $n_1 = n_2 = 2$. When n = 2, there are 4 electron states for each atom

$$\ell = 0 \ , \ m = 0$$
$$\ell = 1 \ , \ m = 1, 0, -1$$

Therefore, there are 16 = (4 x 4) degenerate unperturbed zero-order states, each with energy

$$E_0 = -\frac{e^2}{8a_0} - \frac{e^2}{8a_0} = -\frac{e^2}{4a_0} \tag{10.416}$$


We use degenerate perturbation theory. To carry out degenerate perturbation theory, we must construct the 16 x 16 matrix representation of $\langle V \rangle$ and diagonalize it to find the energies corrected to first-order.

The typical matrix element is (leaving off the n labels)

$$\begin{aligned}
\langle \ell_1m_1\ell_2m_2 | V | \ell_1'm_1'\ell_2'm_2' \rangle
&= \frac{e^2}{R^3} \langle \ell_1m_1 | x_1 | \ell_1'm_1' \rangle \langle \ell_2m_2 | x_2 | \ell_2'm_2' \rangle \\
&\quad + \frac{e^2}{R^3} \langle \ell_1m_1 | y_1 | \ell_1'm_1' \rangle \langle \ell_2m_2 | y_2 | \ell_2'm_2' \rangle \\
&\quad - 2\frac{e^2}{R^3} \langle \ell_1m_1 | z_1 | \ell_1'm_1' \rangle \langle \ell_2m_2 | z_2 | \ell_2'm_2' \rangle
\end{aligned} \tag{10.417}$$

where

$$x = r\sin\theta\cos\phi = -r\sqrt{\frac{2\pi}{3}} \, (Y_{1,1} - Y_{1,-1}) \tag{10.418}$$

$$y = r\sin\theta\sin\phi = +ir\sqrt{\frac{2\pi}{3}} \, (Y_{1,1} + Y_{1,-1}) \tag{10.419}$$

$$z = r\cos\theta = r\sqrt{\frac{4\pi}{3}} \, Y_{10} \tag{10.420}$$

and

$$\langle n\ell m | x | n\ell'm' \rangle = -\sqrt{\frac{2\pi}{3}} \left[ \int_0^\infty r^3 R_{n\ell}(r) R_{n\ell'}(r) dr \right] \int d\Omega \, Y^*_{\ell m} (Y_{1,1} - Y_{1,-1}) Y_{\ell'm'} \tag{10.421}$$

$$\langle n\ell m | y | n\ell'm' \rangle = i\sqrt{\frac{2\pi}{3}} \left[ \int_0^\infty r^3 R_{n\ell}(r) R_{n\ell'}(r) dr \right] \int d\Omega \, Y^*_{\ell m} (Y_{1,1} + Y_{1,-1}) Y_{\ell'm'} \tag{10.422}$$

$$\langle n\ell m | z | n\ell'm' \rangle = \sqrt{\frac{4\pi}{3}} \left[ \int_0^\infty r^3 R_{n\ell}(r) R_{n\ell'}(r) dr \right] \int d\Omega \, Y^*_{\ell m} Y_{10} Y_{\ell'm'} \tag{10.423}$$

Now let us return to the subject of selection rules.

We will just begin the discussion here and then elaborate and finish it later when we cover the topic of time-dependent perturbation theory.

Consider the integrals involving the spherical harmonics above. We have

$$\int d\Omega \, Y^*_{\ell m} Y_{1m''} Y_{\ell'm'} = 0 \quad \text{unless} \quad \begin{cases} \ell + \ell' + 1 = \text{even} \\ m = m' + m'' \end{cases} \tag{10.424}$$

These rules follow from doing the integrations over the $\theta$ and $\phi$ variables.

In particular, when the perturbation involves x, y, or z we have

for x and y: $m = m' \pm 1$
for z: $m = m'$

which is the so-called

$$\Delta m = \pm 1, 0 \tag{10.425}$$

selection rule for this type of perturbation.

In addition, we have the

$$\Delta \ell = \pm 1 \tag{10.426}$$

selection rule for this type of perturbation.

These two rules will enable us to say many matrix elements are equal to zero by inspection.

We can derive two more very useful selection rules as follows. We know that

$$[L_i, r_j] = i\hbar\varepsilon_{ijk}r_k \tag{10.427}$$
This allows us to write (after much algebra)

$$\begin{aligned}
[L_z, V] &= [(L_{1z} + L_{2z}), V] = [L_{1z}, V] + [L_{2z}, V] \\
&= \frac{e^2}{R^3} [L_{1z}, (x_1x_2 + y_1y_2 - 2z_1z_2)] + \frac{e^2}{R^3} [L_{2z}, (x_1x_2 + y_1y_2 - 2z_1z_2)] \\
&= 0
\end{aligned}$$

This implies that $[L_z, H] = 0$ or that the z-component of the total angular momentum of the electrons is not changed by this perturbation (it is conserved).

This gives the selection rule

$$m_1 + m_2 = m_1' + m_2' \tag{10.428}$$

Summarizing the selection rules we have

$\ell_1 + \ell_1' + 1$ = even          ( = reason b for a zero)
$\ell_2 + \ell_2' + 1$ = even          ( = reason c for a zero)
$m_1 - m_1' = \pm 1, 0$                ( = reason d for a zero)
$m_2 - m_2' = \pm 1, 0$                ( = reason d for a zero)
$m_1 + m_2 = m_1' + m_2'$              ( = reason a for a zero)

and we have also given reason labels for each.

The unperturbed states (using the format $|\ell_1m_1\rangle |\ell_2m_2\rangle$) are

|1> = |00>|00> ,      |2> = |00>|11> ,      |3> = |00>|10> ,      |4> = |00>|1,-1>
|5> = |11>|00> ,      |6> = |11>|11> ,      |7> = |11>|10> ,      |8> = |11>|1,-1>
|9> = |10>|00> ,      |10> = |10>|11> ,     |11> = |10>|10> ,     |12> = |10>|1,-1>
|13> = |1,-1>|00> ,   |14> = |1,-1>|11> ,   |15> = |1,-1>|10> ,   |16> = |1,-1>|1,-1>


The $\langle V \rangle$ matrix has entries labeled either by a VALUE or by 0 followed by the reason label(s) for the zero, with the rows/columns labeled in order as

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

There are only 12 nonzero elements (out of 256) and because the matrix is Hermitian we only have 6 to calculate (one side of the diagonal). It should now be clear why finding the relevant selection rules is so important!

Table 10.13: $\langle V \rangle$ matrix entries. The nonzero entries are $V_{1,8} = A$, $V_{1,11} = B$, $V_{1,14} = C$, $V_{2,5} = D$, $V_{3,9} = E$, $V_{4,13} = F$ and their Hermitian conjugates; every other entry is zero for one or more of the reasons a, b, c, d.

The 12 nonzero elements are given by the expressions


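$$A = C = \frac{e^2}{R^3} \langle 200 | \langle 200 | (x_1x_2 + y_1y_2) | 211 \rangle | 21,-1 \rangle \tag{10.429}$$

$$B = E = -2\frac{e^2}{R^3} \langle 200 | \langle 200 | z_1z_2 | 210 \rangle | 210 \rangle \tag{10.430}$$

$$D = \frac{e^2}{R^3} \langle 200 | \langle 211 | (x_1x_2 + y_1y_2) | 211 \rangle | 200 \rangle \tag{10.431}$$

$$F = \frac{e^2}{R^3} \langle 200 | \langle 21,-1 | (x_1x_2 + y_1y_2) | 21,-1 \rangle | 200 \rangle \tag{10.432}$$

If we define

$$\alpha = \sqrt{\frac{1}{3}} \int_0^\infty r^3 R_{20} R_{21} dr \tag{10.433}$$

we have

$$\langle 200 | x | 211 \rangle = -\frac{\alpha}{\sqrt{2}} = -\langle 200 | x | 21,-1 \rangle \tag{10.434}$$

$$\langle 200 | y | 211 \rangle = -i\frac{\alpha}{\sqrt{2}} = \langle 200 | y | 21,-1 \rangle \tag{10.435}$$

$$\langle 200 | z | 210 \rangle = \alpha \tag{10.436}$$

and

$$A = C = \frac{B}{2} = \frac{E}{2} = -D = -F = -\frac{e^2}{R^3}\alpha^2 \tag{10.437}$$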






Now we rearrange the row/column labels (the original choice was arbitrary) to create a Jordan canonical form with blocks on the diagonal. We choose

        1    8   11   14    2    5    3    9    4   13    6    7   10   12   15   16
  1     0    A   2A    A    0    0    0    0    0    0    0    0    0    0    0    0
  8     A    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
 11    2A    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
 14     A    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
  2     0    0    0    0    0   -A    0    0    0    0    0    0    0    0    0    0
  5     0    0    0    0   -A    0    0    0    0    0    0    0    0    0    0    0
  3     0    0    0    0    0    0    0   2A    0    0    0    0    0    0    0    0
  9     0    0    0    0    0    0   2A    0    0    0    0    0    0    0    0    0
  4     0    0    0    0    0    0    0    0    0   -A    0    0    0    0    0    0
 13     0    0    0    0    0    0    0    0   -A    0    0    0    0    0    0    0
  6     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
  7     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
 10     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
 12     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
 15     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
 16     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

Table 10.14: Jordan Canonical Form


This is called the block-diagonalized form. We have one 4x4 and three 2x2 matrices to diagonalize. We get the eigenvalues

4 x 4 → 0, 0, $\pm\sqrt{6}A$ ,   2 x 2 → $\pm A$ ,   2 x 2 → $\pm 2A$ ,   2 x 2 → $\pm A$

Therefore, the energies correct to first-order are

$E_0 + \sqrt{6}A$     degeneracy = 1
$E_0 + 2A$            degeneracy = 1
$E_0 + A$             degeneracy = 2
$E_0$                 degeneracy = 8
$E_0 - A$             degeneracy = 2
$E_0 - 2A$            degeneracy = 1
$E_0 - \sqrt{6}A$     degeneracy = 1          (10.438)

That was a real problem!
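The eigenvalues and degeneracies can be cross-checked by diagonalizing the full 16 x 16 matrix numerically. The sketch below places the nonzero entries by the selection rules, in units where A = 1, at positions (1,8), (1,14) = A, (1,11), (3,9) = 2A and (2,5), (4,13) = -A. It uses a small cyclic Jacobi rotation routine written for this illustration; the routine name `jacobi_eigenvalues` is hypothetical and any standard symmetric eigenvalue solver would serve equally well.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-22, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[q][q] == a[p][p]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # A <- A J   (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):      # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


A = 1.0
nonzero = {(1, 8): A, (1, 11): 2 * A, (1, 14): A,
           (2, 5): -A, (3, 9): 2 * A, (4, 13): -A}
V = [[0.0] * 16 for _ in range(16)]
for (i, j), value in nonzero.items():
    V[i - 1][j - 1] = value     # state labels are 1-based
    V[j - 1][i - 1] = value     # Hermitian (here real symmetric) partner

eigenvalues = jacobi_eigenvalues(V)
print([round(x, 6) for x in eigenvalues])
```

The printed spectrum contains $\pm\sqrt{6}$ and $\pm 2$ once each, $\pm 1$ twice each, and zero eight times, for a total of 16 states.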
10.7. Variational Methods

All perturbation methods rely on our ability to make the separation $H = H_0 + V$ where $H_0$ is solvable exactly and $V$ is a small correction. The Rayleigh-Ritz variational method is not subject to any such restrictions. This method is based on the following mathematical results.

We can always write

$$H = IH = \sum_N |N\rangle\langle N| H = \sum_N E_N |N\rangle\langle N| \tag{10.439}$$

where

$$H|N\rangle = E_N|N\rangle \tag{10.440}$$

This is just the spectral decomposition of $H$ in terms of its eigenvectors and eigenvalues. Now, if we choose some arbitrary state vector $|\psi\rangle$ (called a trial vector), then we have

$$\begin{aligned}
\langle\psi|H|\psi\rangle &= \sum_N E_N \langle\psi|N\rangle\langle N|\psi\rangle \\
&\geq \sum_N E_0 \langle\psi|N\rangle\langle N|\psi\rangle = E_0 \sum_N \langle\psi|N\rangle\langle N|\psi\rangle \\
&= E_0 \langle\psi| \Big( \sum_N |N\rangle\langle N| \Big) |\psi\rangle = E_0 \langle\psi|I|\psi\rangle = E_0 \langle\psi|\psi\rangle
\end{aligned} \tag{10.441}$$

or

$$\frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} \geq E_0 \tag{10.442}$$

for any choice of the trial vector $|\psi\rangle$, where $E_0$ is the ground state energy (the lowest energy). Equality holds only if $|\psi\rangle$ is the true ground state vector. This result says that

$$\frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} \ \text{is an upper bound for } E_0 \tag{10.443}$$
Procedure

1. Pick a trial vector $|\psi\rangle$ that contains unknown parameters $\{a_k\}$.

2. Calculate

$$\frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} = E_0(\{a_k\}) \tag{10.444}$$

3. Since $E_0(\{a_k\})$ is an upper bound, we then minimize it with respect to all of the parameters $\{a_k\}$. This gives a least upper bound for that choice of the functional form for the trial vector.

4. We perform the minimization by setting

$$\frac{\partial E_0(\{a_k\})}{\partial a_k} = 0 \quad \text{for all } k \tag{10.445}$$

5. The more complex the trial vector, i.e., the more parameters we incorporate, the better we can approximate the true functional form of the ground state vector and the closer we will get to the true energy value.

What about states other than the ground state? If the ground state has different symmetry properties than the first excited state, i.e.,

ground state → $\ell = 0$ → contains $Y_{00}$
1st excited state → $\ell = 1$ → contains $Y_{1m}$

then if we choose a trial vector with the symmetry of an $\ell = 0$ state we obtain an approximation to the ground state energy. If, however, we choose a trial vector with the symmetry of an $\ell = 1$ state, then we obtain an approximation to the first-excited state energy and so on.

In other words, the variational method always gives the least upper bound for the energy of the state with the same symmetry as the trial vector.

Example

Let us choose the harmonic oscillator Hamiltonian

$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}kx^2 \tag{10.446}$$

and a trial wave function

$$\psi(x,a) = \begin{cases} (a^2 - x^2)^2 & |x| < a \\ 0 & |x| \geq a \end{cases} \tag{10.447}$$

where $a$ is an unknown parameter. The variational principle says that

$$\frac{\langle\psi(a)|H|\psi(a)\rangle}{\langle\psi(a)|\psi(a)\rangle} = E_0(a) \geq E_0 = \text{true ground state energy}$$

We get a best value for this choice of trial function by minimizing with respect to $a$ using

$$\frac{dE_0(a)}{da} = 0 \tag{10.448}$$

Now we need to calculate the integrals. We have for the denominator (the normalization integral)

$$\langle\psi(a)|\psi(a)\rangle = \int_{-a}^{a} \psi^2(x,a) dx = \int_{-a}^{a} (a^2 - x^2)^4 dx = 2\int_0^a (a^2 - x^2)^4 dx = \frac{256}{315}a^9 \tag{10.449}$$

and for the numerator

$$\begin{aligned}
\langle\psi(a)|H|\psi(a)\rangle &= -\frac{\hbar^2}{2m} \, 2\int_0^a \psi(x,a)\frac{d^2\psi(x,a)}{dx^2} dx + \frac{1}{2}k \, 2\int_0^a \psi^2(x,a) x^2 dx \\
&= -\frac{\hbar^2}{2m} \left( -\frac{256}{105}a^7 \right) + \frac{k}{2} \left( \frac{256}{3465}a^{11} \right)
\end{aligned} \tag{10.450}$$

Therefore,

$$E_0(a) = 1.500\frac{\hbar^2}{ma^2} + 0.045ka^2 \tag{10.451}$$

The minimum condition gives

$$a^2 = 5.745 \left( \frac{\hbar^2}{mk} \right)^{1/2} \tag{10.452}$$

which then says that

$$0.522\hbar\omega \geq E_0 \tag{10.453}$$

The true value is $0.500\hbar\omega$. This is an excellent result considering that the trial function does not look at all like the correct ground-state wave function (which is a Gaussian function). This points out clearly how powerful the variational technique can be for many problems.
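The integrals for the trial function $\psi = (a^2 - x^2)^2$ are polynomial, so they can be evaluated with exact rational arithmetic as a check. The sketch below rescales $x = au$ and is only an illustration; the helper names `poly_mul`, `poly_deriv` and `integrate_sym` are not from the text. With this trial function it yields $E_0(a) = \frac{3}{2}\frac{\hbar^2}{ma^2} + \frac{1}{22}ka^2$ and a minimum of $\sqrt{3/11}\,\hbar\omega \approx 0.522\hbar\omega$.

```python
from fractions import Fraction
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_deriv(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]


def integrate_sym(p):
    """Integral over [-1, 1]; only the even powers contribute."""
    return sum(Fraction(2, k + 1) * c for k, c in enumerate(p) if k % 2 == 0)


psi = [1, 0, -2, 0, 1]                      # (1 - u^2)^2 with x = a u
norm = integrate_sym(poly_mul(psi, psi))    # <psi|psi> / a^9
kin = integrate_sym(poly_mul(poly_deriv(psi), poly_deriv(psi)))   # integral of psi'^2 / a^7
pot = integrate_sym(poly_mul([0, 0, 1], poly_mul(psi, psi)))      # integral of x^2 psi^2 / a^11

t_coeff = kin / (2 * norm)    # kinetic term in units of hbar^2 / (m a^2)
v_coeff = pot / (2 * norm)    # potential term in units of k a^2

a_squared = math.sqrt(t_coeff / v_coeff)   # minimizing a^2 in units of hbar / sqrt(m k)
e_min = 2 * math.sqrt(t_coeff * v_coeff)   # minimum energy in units of hbar * omega

print(norm, kin, pot)
print(t_coeff, v_coeff, a_squared, e_min)
```

Minimizing $t/a^2 + va^2$ gives $a^4 = t/v$ and $E_{min} = 2\sqrt{tv}$, which is what the last two lines evaluate.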


10.8. 2nd-Order Degenerate Perturbation Theory

Suppose that the first order correction in perturbation theory is zero for some degenerate states so that the states remain degenerate. In this case, second-order degenerate perturbation theory must be applied. This is complex. We follow the derivation in Schiff (using our notation).

We assume that

$$\varepsilon_m = \varepsilon_k \ , \quad V_{km} = 0 \quad \text{and} \quad V_{kk} = V_{mm} \tag{10.454}$$

so that the degeneracy is not removed in first order.

We assume the equations (to second order)

$$H = H_0 + V = H_0 + gU \tag{10.455}$$

$$|M\rangle = a_m|m\rangle + a_k|k\rangle + g\sum_{l \neq m,k} a_l^{(1)}|l\rangle + g^2\sum_{l \neq m,k} a_l^{(2)}|l\rangle \tag{10.456}$$

$$|K\rangle = b_m|m\rangle + b_k|k\rangle + g\sum_{l \neq m,k} b_l^{(1)}|l\rangle + g^2\sum_{l \neq m,k} b_l^{(2)}|l\rangle \tag{10.457}$$

$$|N\rangle = |n\rangle + g\sum_{l \neq m,k} a_{nl}^{(1)}|l\rangle + g^2\sum_{l \neq m,k} a_{nl}^{(2)}|l\rangle \ , \quad n \neq m,k \tag{10.458}$$

$$E_p = \varepsilon_p + gE_p^{(1)} + g^2E_p^{(2)} \tag{10.459}$$

$$H|M\rangle = (H_0 + V)|M\rangle = E_m|M\rangle = (\varepsilon_m + gE_m^{(1)} + g^2E_m^{(2)})|M\rangle \tag{10.460}$$

where the degenerate states are labeled by $k, m$ and we assume the degenerate zero-order states are linear combinations of the two zero order degenerate states as shown.


A very important point is being made here. 

If the system remains degenerate after first-order correction, then one must 
redevelop the equations for degenerate perturbation theory, using the correct 
zero order state vectors, i.e., linear combinations of the degenerate states. Even 
in the special case where the degenerate basis is uncoupled, i.e., 

$$\langle k|V|m\rangle = V_{km} = 0 \tag{10.461}$$

we must not use non-degenerate perturbation theory, as one might do if many 
textbooks are to be believed. 

Remember that the primary object of degenerate perturbation theory is not 
only to eliminate the divergent terms, but to determine the appropriate linear 
combinations to use for zero-order state vectors. If we start with the wrong 
linear combinations, then we would have a discontinuous change of states in the 
limit of vanishing interactions, which says that the perturbation expansion is 
not valid. 

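We then have

$$\begin{aligned}
&ga_mU|m\rangle + ga_kU|k\rangle + g\sum_{l \neq m,k} a_l^{(1)}\varepsilon_l|l\rangle + g^2\sum_{l \neq m,k} a_l^{(2)}\varepsilon_l|l\rangle + g^2\sum_{l \neq m,k} a_l^{(1)}U|l\rangle \\
&= (gE_m^{(1)} + g^2E_m^{(2)})(a_m|m\rangle + a_k|k\rangle) + g\sum_{l \neq m,k} a_l^{(1)}\varepsilon_m|l\rangle + g^2\sum_{l \neq m,k} a_l^{(2)}\varepsilon_m|l\rangle + g^2\sum_{l \neq m,k} a_l^{(1)}E_m^{(1)}|l\rangle
\end{aligned}$$

Applying the linear functional $\langle m|$ we get

$$ga_mU_{mm} + g^2\sum_{l \neq m,k} a_l^{(1)}U_{ml} = gE_m^{(1)}a_m + g^2E_m^{(2)}a_m \tag{10.462}$$

Applying the linear functional $\langle k|$ we get

$$ga_kU_{kk} + g^2\sum_{l \neq m,k} a_l^{(1)}U_{kl} = gE_m^{(1)}a_k + g^2E_m^{(2)}a_k \tag{10.463}$$

Applying the linear functional $\langle n|$ we get

$$ga_mU_{nm} + ga_kU_{nk} + g\varepsilon_na_n^{(1)} + g^2\varepsilon_na_n^{(2)} + g^2\sum_{l \neq m,k} a_l^{(1)}U_{nl} = g\varepsilon_ma_n^{(1)} + g^2\varepsilon_ma_n^{(2)} + g^2E_m^{(1)}a_n^{(1)} \tag{10.464}$$

The first-order terms in (10.462) and (10.463) give the expected result

$$E_m^{(1)} = U_{mm} = U_{kk} \tag{10.465}$$

The second-order terms in (10.462) and (10.463) give

$$\sum_{l \neq m,k} a_l^{(1)}U_{ml} = E_m^{(2)}a_m \ , \quad \sum_{l \neq m,k} a_l^{(1)}U_{kl} = E_m^{(2)}a_k \tag{10.466}$$

The first-order terms in (10.464) give an expression for $a_l^{(1)}$ when $n = l \neq m,k$

$$a_l^{(1)}(\varepsilon_m - \varepsilon_l) = a_mU_{lm} + a_kU_{lk} \tag{10.467}$$

Substituting (10.467) into (10.466) we get a pair of homogeneous algebraic equations for $a_m$ and $a_k$.

These equations have a nonvanishing solution if and only if the determinant of the coefficients of $a_m$ and $a_k$ is zero or

$$\det \begin{pmatrix}
\sum\limits_{l \neq m,k} \dfrac{U_{ml}U_{lm}}{\varepsilon_m - \varepsilon_l} - E_m^{(2)} & \sum\limits_{l \neq m,k} \dfrac{U_{ml}U_{lk}}{\varepsilon_m - \varepsilon_l} \\
\sum\limits_{l \neq m,k} \dfrac{U_{kl}U_{lm}}{\varepsilon_m - \varepsilon_l} & \sum\limits_{l \neq m,k} \dfrac{U_{kl}U_{lk}}{\varepsilon_m - \varepsilon_l} - E_m^{(2)}
\end{pmatrix} = 0 \tag{10.468}$$

or

$$\det \begin{pmatrix}
\sum\limits_{l \neq m,k} \dfrac{V_{ml}V_{lm}}{\varepsilon_m - \varepsilon_l} - g^2E_m^{(2)} & \sum\limits_{l \neq m,k} \dfrac{V_{ml}V_{lk}}{\varepsilon_m - \varepsilon_l} \\
\sum\limits_{l \neq m,k} \dfrac{V_{kl}V_{lm}}{\varepsilon_m - \varepsilon_l} & \sum\limits_{l \neq m,k} \dfrac{V_{kl}V_{lk}}{\varepsilon_m - \varepsilon_l} - g^2E_m^{(2)}
\end{pmatrix} = 0 \tag{10.469}$$

The two roots of this equation are $g^2E_m^{(2)}$ and $g^2E_k^{(2)}$ and the two pairs of solutions of (10.466) are $a_m, a_k$ and $b_m, b_k$. We thus obtain perturbed energy levels in which the degeneracy has been removed in second order and we also find the correct linear combinations of the unperturbed degenerate state vectors $|m\rangle$ and $|k\rangle$.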


Example

This is a tricky problem because the degeneracy between the first and second state is not removed in first order degenerate perturbation theory.

A system that has three unperturbed states can be represented by the perturbed Hamiltonian matrix

$$H = H_0 + V = \begin{pmatrix} E_1 & 0 & 0 \\ 0 & E_1 & 0 \\ 0 & 0 & E_2 \end{pmatrix} + \begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a^* & b^* & 0 \end{pmatrix} = \begin{pmatrix} E_1 & 0 & a \\ 0 & E_1 & b \\ a^* & b^* & E_2 \end{pmatrix} \tag{10.470}$$

where $E_2 > E_1$. The quantities $a$ and $b$ are to be regarded as perturbations that are of the same order and are small compared to $E_2 - E_1$. The procedure is:




1. Diagonalize the matrix to find the exact eigenvalues.

2. Use second-order nondegenerate perturbation theory to calculate the perturbed energies. Is this procedure correct?

3. Use second-order degenerate perturbation theory to calculate the perturbed energies.

4. Compare the three results obtained.

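Solution - We have

$$H = H_0 + V = \begin{pmatrix} E_1 & 0 & 0 \\ 0 & E_1 & 0 \\ 0 & 0 & E_2 \end{pmatrix} + \begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a^* & b^* & 0 \end{pmatrix} = \begin{pmatrix} E_1 & 0 & a \\ 0 & E_1 & b \\ a^* & b^* & E_2 \end{pmatrix}$$

with $E_2 > E_1$ and $E_2 - E_1 \gg |a|, |b|$.

1. For an exact solution we need to find the eigenvalues of

$$\begin{pmatrix} E_1 & 0 & a \\ 0 & E_1 & b \\ a^* & b^* & E_2 \end{pmatrix} \tag{10.471}$$

This leads to the characteristic equation

$$(E_1 - E)(E_1 - E)(E_2 - E) - (E_1 - E)|b|^2 - (E_1 - E)|a|^2 = 0 \tag{10.472}$$

This says that one of the eigenvalues is $E = E_1$ and the remaining quadratic equation is

$$E^2 - (E_1 + E_2)E + (E_1E_2 - |b|^2 - |a|^2) = 0 \tag{10.473}$$

or the other two eigenvalues are

$$E = \frac{1}{2} \left( (E_1 + E_2) \pm \sqrt{(E_1 + E_2)^2 - 4(E_1E_2 - |b|^2 - |a|^2)} \right) \tag{10.474}$$

The exact energy values are

$$\begin{aligned}
E &= E_1 \\
E &= \frac{1}{2} \left( (E_1 + E_2) - \sqrt{(E_1 + E_2)^2 - 4(E_1E_2 - |b|^2 - |a|^2)} \right) \approx E_1 + \frac{|a|^2 + |b|^2}{E_1 - E_2} \\
E &= \frac{1}{2} \left( (E_1 + E_2) + \sqrt{(E_1 + E_2)^2 - 4(E_1E_2 - |b|^2 - |a|^2)} \right) \approx E_2 - \frac{|a|^2 + |b|^2}{E_1 - E_2}
\end{aligned}$$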
2. Apply non-degenerate second-order perturbation theory. The unperturbed system has

$$H_0 = \begin{pmatrix} E_1 & 0 & 0 \\ 0 & E_1 & 0 \\ 0 & 0 & E_2 \end{pmatrix} \tag{10.475}$$

Since this is diagonal we have

$$E_1^{(0)} = E_2^{(0)} = E_1 \ , \quad E_3^{(0)} = E_2 \quad \text{(levels 1 and 2 are degenerate)}$$

and unperturbed eigenvectors

$$|1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \ , \quad |2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \ , \quad |3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{10.476}$$

The perturbation is (in the unperturbed basis)

$$V = \begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a^* & b^* & 0 \end{pmatrix} \tag{10.477}$$

Since the diagonal matrix elements of the perturbation are zero we have

$$E_1^{(1)} = E_2^{(1)} = E_3^{(1)} = 0 \quad \text{or no first-order corrections} \tag{10.478}$$

Thus, levels 1 and 2 remain degenerate.

If we formally (and incorrectly) apply non-degenerate second-order perturbation theory to this system we get

$$E_n^{(2)} = \sum_{m \neq n} \frac{|V_{mn}|^2}{E_n^{(0)} - E_m^{(0)}} \tag{10.479}$$

Now $V_{12} = 0$, $V_{13} = a$, $V_{23} = b$ and so we get

$$E_1^{(2)} = \sum_{m \neq 1} \frac{|V_{m1}|^2}{E_1^{(0)} - E_m^{(0)}} = \frac{0}{0} + \frac{|V_{13}|^2}{E_1^{(0)} - E_3^{(0)}} \overset{?}{=} \frac{|a|^2}{E_1 - E_2} \quad \text{incorrect because of 0/0 term}$$

$$E_2^{(2)} = \sum_{m \neq 2} \frac{|V_{m2}|^2}{E_2^{(0)} - E_m^{(0)}} = \frac{0}{0} + \frac{|V_{23}|^2}{E_2^{(0)} - E_3^{(0)}} \overset{?}{=} \frac{|b|^2}{E_1 - E_2} \quad \text{incorrect because of 0/0 term}$$

$$E_3^{(2)} = \sum_{m \neq 3} \frac{|V_{m3}|^2}{E_3^{(0)} - E_m^{(0)}} = \frac{|V_{13}|^2}{E_3^{(0)} - E_1^{(0)}} + \frac{|V_{23}|^2}{E_3^{(0)} - E_2^{(0)}} = \frac{|a|^2 + |b|^2}{E_2 - E_1} \quad \text{agrees with exact solution}$$

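3. Now we apply second-order degenerate perturbation theory for the two degenerate levels. We have

$$\det \begin{pmatrix}
\dfrac{|V_{13}|^2}{E_1^{(0)} - E_3^{(0)}} - E^{(2)} & \dfrac{V_{13}V_{32}}{E_1^{(0)} - E_3^{(0)}} \\
\dfrac{V_{23}V_{31}}{E_1^{(0)} - E_3^{(0)}} & \dfrac{|V_{23}|^2}{E_1^{(0)} - E_3^{(0)}} - E^{(2)}
\end{pmatrix}
= \det \begin{pmatrix}
\dfrac{|a|^2}{E_1 - E_2} - E^{(2)} & \dfrac{ab^*}{E_1 - E_2} \\
\dfrac{ba^*}{E_1 - E_2} & \dfrac{|b|^2}{E_1 - E_2} - E^{(2)}
\end{pmatrix} = 0 \tag{10.480}$$

$$(E^{(2)})^2 - \frac{|a|^2 + |b|^2}{E_1 - E_2}E^{(2)} + \frac{|a|^2|b|^2}{(E_1 - E_2)^2} - \frac{|a|^2|b|^2}{(E_1 - E_2)^2} = E^{(2)} \left( E^{(2)} - \frac{|a|^2 + |b|^2}{E_1 - E_2} \right) = 0$$

corresponding to

$$E^{(2)} = 0 \ , \quad E^{(2)} = \frac{|a|^2 + |b|^2}{E_1 - E_2}$$

so that to second-order we have

$$E_1 \ , \quad E_1 + \frac{|a|^2 + |b|^2}{E_1 - E_2} \ , \quad E_2 - \frac{|a|^2 + |b|^2}{E_1 - E_2} \tag{10.481}$$

which agrees with the exact result.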
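The comparison asked for in step 4 can be illustrated numerically. The sketch below uses sample values for $E_1$, $E_2$, $a$ and $b$ chosen only for illustration, and evaluates the exact eigenvalues from (10.474), the second-order degenerate result (10.481) and the formal non-degenerate result that drops the 0/0 terms. The function names are illustrative.

```python
import math


def exact_levels(e1, e2, a, b):
    """Exact eigenvalues of [[E1, 0, a], [0, E1, b], [a*, b*, E2]]."""
    w2 = abs(a) ** 2 + abs(b) ** 2
    disc = math.sqrt((e1 + e2) ** 2 - 4 * (e1 * e2 - w2))
    return sorted([e1, 0.5 * ((e1 + e2) - disc), 0.5 * ((e1 + e2) + disc)])


def degenerate_second_order(e1, e2, a, b):
    """Second-order degenerate perturbation theory result."""
    shift = (abs(a) ** 2 + abs(b) ** 2) / (e1 - e2)
    return sorted([e1, e1 + shift, e2 - shift])


def naive_nondegenerate(e1, e2, a, b):
    """Formal non-degenerate result obtained by ignoring the 0/0 terms."""
    return sorted([e1 + abs(a) ** 2 / (e1 - e2),
                   e1 + abs(b) ** 2 / (e1 - e2),
                   e2 + (abs(a) ** 2 + abs(b) ** 2) / (e2 - e1)])


e1, e2, a, b = 1.0, 2.0, 0.01, 0.02j    # sample values, |a|, |b| << E2 - E1
exact = exact_levels(e1, e2, a, b)
degenerate = degenerate_second_order(e1, e2, a, b)
naive = naive_nondegenerate(e1, e2, a, b)
print("exact       :", exact)
print("degenerate  :", degenerate)
print("naive       :", naive)
```

For these values the degenerate treatment reproduces the exact levels up to terms of fourth order in the perturbation, while the naive treatment misplaces both levels that originate from the degenerate pair and only gets the upper level right.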


10.9. Problems 

10.9.1. Box with a Sagging Bottom 

Consider a particle in a 1-dimensional box with a sagging bottom given by

$$V(x) = \begin{cases} -V_0\sin(\pi x/L) & \text{for } 0 < x < L \\ \infty & \text{for } x < 0 \text{ and } x > L \end{cases}$$

(a) For small $V_0$ this potential can be considered as a small perturbation of an infinite box with a flat bottom, for which we have already solved the Schrödinger equation. What is the perturbation potential?

(b) Calculate the energy shift due to the sagging for the particle in the $n^{th}$ stationary state to first order in the perturbation.


10.9.2. Perturbing the Infinite Square Well 

Calculate the first order energy shift for the first three states of the infinite square well in one dimension due to the perturbation

$$V(x) = V_0\frac{x}{a}$$

as shown in Figure 10.7 below.

Figure 10.7: Ramp perturbation


10.9.3. Weird Perturbation of an Oscillator 

A particle of mass m moves in one dimension subject to a harmonic oscillator 
potential $\frac{1}{2} m\omega^2 x^2$. The particle is perturbed by an additional weak anharmonic
force described by the potential $\Delta V = \lambda \sin kx$, $\lambda \ll 1$. Find the corrected
ground state.

10.9.4. Perturbing the Infinite Square Well Again 

A particle of mass m moves in a one dimensional potential box 


$$V(x) = \begin{cases} \infty & \text{for } |x| > 3a \\ 0 & \text{for } a < x < 3a \\ 0 & \text{for } -3a < x < -a \\ V_0 & \text{for } |x| < a \end{cases}$$

as shown in Figure 10.8 below. 

Use first order perturbation theory to calculate the new energy of the ground
state.


Figure 10.8: Square bump perturbation 


10.9.5. Perturbing the 2-dimensional Infinite Square Well 

Consider a particle in a 2-dimensional infinite square well given by 


$$V(x,y) = \begin{cases} 0 & \text{for } 0 < x < a \text{ and } 0 < y < a \\ \infty & \text{otherwise} \end{cases}$$


(a) What are the energy eigenvalues and eigenkets for the three lowest levels? 

(b) We now add a perturbation given by 


$$V_1(x,y) = \begin{cases} \lambda xy & \text{for } 0 < x < a \text{ and } 0 < y < a \\ 0 & \text{otherwise} \end{cases}$$

Determine the first order energy shifts for the three lowest levels for $\lambda \ll 1$.

(c) Draw an energy diagram with and without the perturbation for the three
energy states. Make sure to specify which unperturbed state is connected
to which perturbed state.


10.9.6. Not So Simple Pendulum 

A mass m is attached by a massless rod of length L to a pivot P and swings 
in a vertical plane under the influence of gravity as shown in Figure 10.9 below. 



Figure 10.9: A quantum pendulum 


(a) In the small angle approximation find the quantum mechanical energy 
levels of the system. 

(b) Find the lowest order correction to the ground state energy resulting from 
the inaccuracy of the small angle approximation.

10.9.7. 1-Dimensional Anharmonic Oscillator

Consider a particle of mass m in a 1-dimensional anharmonic oscillator potential 
with potential energy 

$$V(x) = \frac{1}{2} m\omega^2 x^2 + \alpha x^3 + \beta x^4$$

(a) Calculate the 1st-order correction to the energy of the nth perturbed state.
Write down the energy correct to 1st-order.

(b) Evaluate all the required matrix elements of $x^3$ and $x^4$ needed to determine
the wave function of the nth state perturbed to 1st-order.

10.9.8. A Relativistic Correction for Harmonic Oscillator 

A particle of mass m moves in a 1-dimensional oscillator potential 

$$V(x) = \frac{1}{2} m\omega^2 x^2$$

In the nonrelativistic limit, where the kinetic energy and the momentum are
related by

$$T = \frac{p^2}{2m}$$

the ground state energy is well known to be $E_0 = \hbar\omega/2$.

Relativistically, the kinetic energy and the momentum are related by

$$T = E - mc^2 = \sqrt{m^2c^4 + p^2c^2} - mc^2$$

(a) Determine the lowest order correction to the kinetic energy (a $p^4$ term).

(b) Consider the correction to the kinetic energy as a perturbation and compute
the relativistic correction to the ground state energy.

10.9.9. Degenerate perturbation theory on a spin = 1 system 

Consider the spin Hamiltonian for a system of spin = 1 

$$H = A S_z^2 + B(S_x^2 - S_y^2) \quad , \quad B \ll A$$

This corresponds to a spin = 1 ion located in a crystal with rhombic symmetry.

(a) Solve the unperturbed problem for $H_0 = A S_z^2$.

(b) Find the perturbed energy levels to first order. 

(c) Solve the problem exactly by diagonalizing the Hamiltonian matrix in 
some basis. Compare to perturbation results.

10.9.10. Perturbation Theory in Two-Dimensional Hilbert Space

Consider a spin-1/2 particle in the presence of a static magnetic field along the 
z and x directions, 

$$\vec{B} = B_z \hat{e}_z + B_x \hat{e}_x$$

(a) Show that the Hamiltonian is

$$\hat{H} = \hbar\omega_0 \hat{\sigma}_z + \frac{\hbar\Omega_0}{2} \hat{\sigma}_x$$

where $\hbar\omega_0 = \mu_B B_z$ and $\hbar\Omega_0 = 2\mu_B B_x$.

(b) If $B_x = 0$, the eigenvectors are $|\uparrow_z\rangle$ and $|\downarrow_z\rangle$ with eigenvalues $\pm\hbar\omega_0$ respectively.
Now turn on a weak x field with $B_x \ll B_z$. Use perturbation theory to
find the new eigenvectors and eigenvalues to lowest order in $B_x/B_z$.

(c) If $B_z = 0$, the eigenvectors are $|\uparrow_x\rangle$ and $|\downarrow_x\rangle$ with eigenvalues $\pm\hbar\Omega_0/2$ respectively.
Now turn on a weak z field with $B_z \ll B_x$. Use perturbation theory
to find the new eigenvectors and eigenvalues to lowest order in $B_z/B_x$.

(d) This problem can actually be solved exactly. Find the eigenvectors and
eigenvalues for arbitrary values of $B_z$ and $B_x$. Show that these agree with
your results in parts (b) and (c) by taking appropriate limits.

(e) Plot the energy eigenvalues as a function of $B_z$ for fixed $B_x$. Label the
eigenvectors on the curves when $B_z = 0$ and when $B_z \to \pm\infty$.
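A perturbative answer for a two-level problem like this one can be compared against exact diagonalization with very little code. The following is a minimal sketch, assuming units with ħ = 1 and illustrative field values; `eig_2x2` and `second_order_2x2` are hypothetical helper names introduced here, and the second helper is just the standard nondegenerate second-order formula specialized to two levels.

```python
import math


def eig_2x2(h11, h22, h12):
    """Exact eigenvalues (upper, lower) of the Hermitian matrix
    [[h11, h12], [conj(h12), h22]]."""
    mean = 0.5 * (h11 + h22)
    root = math.sqrt(0.25 * (h11 - h22) ** 2 + abs(h12) ** 2)
    return mean + root, mean - root


def second_order_2x2(e1, e2, v11, v22, v12):
    """Nondegenerate perturbation theory through second order for two
    levels with unperturbed energies e1 > e2 and perturbation elements
    v11, v22 (diagonal) and v12 (off-diagonal)."""
    shift = abs(v12) ** 2 / (e1 - e2)
    return e1 + v11 + shift, e2 + v22 - shift


# Assumed illustrative values: weak x field, Omega0 << omega0, hbar = 1.
omega0, Omega0 = 1.0, 0.1

# In the sigma_z basis the Hamiltonian is [[omega0, Omega0/2], [Omega0/2, -omega0]].
exact = eig_2x2(omega0, -omega0, 0.5 * Omega0)
approx = second_order_2x2(omega0, -omega0, 0.0, 0.0, 0.5 * Omega0)

for e, p in zip(exact, approx):
    print(f"exact = {e:+.8f}   perturbative = {p:+.8f}")
```

Swapping the roles of the two fields (and working in the σ_x basis) gives the corresponding check for the weak z field case.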


10.9.11. Finite Spatial Extent of the Nucleus 

In most discussions of atoms, the nucleus is treated as a positively charged 
point particle. In fact, the nucleus does possess a finite size with a radius given 
approximately by the empirical formula 

$$R \approx r_0 A^{1/3}$$

where $r_0 = 1.2 \times 10^{-13}$ cm (i.e., 1.2 Fermi) and A is the atomic weight or number
(essentially the number of protons and neutrons in the nucleus). A reasonable 
assumption is to take the total nuclear charge +Ze as being uniformly distributed 
over the entire nuclear volume (assumed to be a sphere). 

(a) Derive the following expression for the electrostatic potential energy of an 
electron in the field of the finite nucleus: 




$$V(r) = \begin{cases} -\dfrac{Ze^2}{r} & \text{for } r > R \\ \dfrac{Ze^2}{R} \left( \dfrac{r^2}{2R^2} - \dfrac{3}{2} \right) & \text{for } r < R \end{cases}$$

Draw a graph comparing this potential energy and the point nucleus potential
energy.

(b) Since you know the solution of the point nucleus problem, choose this as
the unperturbed Hamiltonian $H_0$ and construct a perturbation Hamiltonian
$H_1$ such that the total Hamiltonian contains the $V(r)$ derived above.
Write an expression for $H_1$.

(c) Calculate (remember that $R \ll a_0$ = Bohr radius) the 1st-order perturbed
energy for the 1s $(n\ell m) = (100)$ state, obtaining an expression in terms of Z
and fundamental constants. How big is this result compared to the ground
state energy of hydrogen? How does it compare to hyperfine splitting?

10.9.12. Spin-Oscillator Coupling 

Consider a Hamiltonian describing a spin-1/2 particle in a harmonic well as 
given below: 

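$$\hat{H}_0 = \frac{\hbar\omega}{2} \hat{\sigma}_z + \hbar\omega \left( \hat{a}^\dagger \hat{a} + 1/2 \right)$$

(a) Show that

$$\{ |n\rangle \otimes |\downarrow\rangle = |n, \downarrow\rangle \;,\; |n\rangle \otimes |\uparrow\rangle = |n, \uparrow\rangle \}$$

are energy eigenstates with eigenvalues $E_{n,\downarrow} = n\hbar\omega$ and $E_{n,\uparrow} = (n+1)\hbar\omega$,
respectively.

(b) The states associated with the ground-state energy and the first excited
energy level are

$$\{ |0, \downarrow\rangle \;,\; |1, \downarrow\rangle \;,\; |0, \uparrow\rangle \}$$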

What is(are) the ground state(s)? What is(are) the first excited state(s)? 
Note: two states are degenerate. 

(c) Now consider adding an interaction between the harmonic motion and the 
spin, described by the Hamiltonian 

$$\hat{H}_1 = \frac{\hbar\Omega}{2} \left( \hat{a} \hat{\sigma}_+ + \hat{a}^\dagger \hat{\sigma}_- \right)$$

so that the total Hamiltonian is now $H = H_0 + H_1$. Write a matrix representation
of H in the subspace of the ground and first excited states in
the ordered basis given in part (b).

(d) Find the first order correction to the ground state and excited state energy 
eigenvalues for the subspace above. 
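One way to organize the matrix representation asked for in part (c) is to let the Hamiltonian act on labeled basis kets and read off the coefficients. The sketch below assumes the forms H0 = (ħω/2)σ_z + ħω(a†a + 1/2) and H1 = (ħΩ/2)(aσ₊ + a†σ₋); the state labels (n, s) with s = +1 for spin up and s = -1 for spin down, the function names, and the numerical values are illustrative assumptions.

```python
import math


def apply_h(state, omega=1.0, Omega=0.1, hbar=1.0):
    """Return H|n, s> as a dict {basis state: coefficient}.

    state = (n, s) with n the oscillator quantum number and
    s = +1 (spin up) or -1 (spin down).
    """
    n, s = state
    out = {(n, s): hbar * omega * (0.5 * s + n + 0.5)}   # H0 is diagonal
    if s == -1 and n >= 1:
        # a sigma_+ lowers n by one and flips the spin up
        out[(n - 1, +1)] = 0.5 * hbar * Omega * math.sqrt(n)
    if s == +1:
        # a^dagger sigma_- raises n by one and flips the spin down
        out[(n + 1, -1)] = 0.5 * hbar * Omega * math.sqrt(n + 1)
    return out


def matrix_in_basis(basis, **params):
    """Matrix elements <bra|H|ket> in the given ordered basis."""
    return [[apply_h(ket, **params).get(bra, 0.0) for ket in basis]
            for bra in basis]


# Ordered basis of part (b): |0,down>, |1,down>, |0,up>
basis = [(0, -1), (1, -1), (0, +1)]
H = matrix_in_basis(basis, omega=1.0, Omega=0.1)
for row in H:
    print(row)
```

The interaction couples only the two degenerate states, so the ground state decouples into its own 1x1 block.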

10.9.13. Motion in spin-dependent traps 

Consider an electron moving in one dimension, in a spin-dependent trap as 
shown in Figure 10.10 below:

Figure 10.10: A spin-dependent trap


If the electron is in a spin-up state (with respect to the z-axis), it is trapped in 
the right harmonic oscillator well and if it is in a spin-down state (with respect to 
the z-axis), it is trapped in the left harmonic oscillator well. The Hamiltonian 
that governs its dynamics can be written as: 

$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2} m\omega_{osc}^2 (\hat{z} - \Delta z/2)^2 \otimes |\uparrow_z\rangle\langle\uparrow_z| + \frac{1}{2} m\omega_{osc}^2 (\hat{z} + \Delta z/2)^2 \otimes |\downarrow_z\rangle\langle\downarrow_z|$$

(a) What are the energy levels and stationary states of the system? What are
the degeneracies of these states? Sketch an energy level diagram for the
first three levels and label the degeneracies.

(b) A small, constant transverse field $B_x$ is now added with $|\mu_B B_x| \ll \hbar\omega_{osc}$.
Qualitatively sketch how the energy plot in part (a) is modified.

(c) Now calculate the perturbed energy levels for this system.

(d) What are the new eigenstates in the ground-state doublet? For $\Delta z$ macroscopic,
these are sometimes called Schrödinger cat states. Explain why.


10.9.14. Perturbed Oscillator 

A particle of mass m is moving in the 3-dimensional harmonic oscillator potential

$$V(x,y,z) = \frac{1}{2} m\omega^2 (x^2 + y^2 + z^2)$$

A weak perturbation is applied in the form of the function

$$\Delta V(x,y,z) = kxyz + \frac{k^2}{\hbar\omega} x^2 y^2 z^2$$

where k is a small constant. Calculate the shift in the ground state energy to
second order in k. This is not the same as second-order perturbation theory!

10.9.15. Another Perturbed Oscillator


Consider the system described by the Hamiltonian 

$$H = \frac{p^2}{2m} + \frac{m\omega^2}{2\alpha} \left( 1 - e^{-\alpha x^2} \right)$$

Assume that $\alpha \ll m\omega/\hbar$.

(1) Calculate an approximate value for the ground state energy using first-order
perturbation theory by perturbing the harmonic oscillator Hamiltonian

$$H = \frac{p^2}{2m} + \frac{m\omega^2}{2} x^2$$

(2) Calculate an approximate value for the ground state energy using the
variational method with a trial function $\psi = e^{-\beta x^2/2}$.


10.9.16. Helium from Hydrogen - 2 Methods 

(a) Using a simple hydrogenic wave function for each electron, calculate by
perturbation theory the energy in the ground state of the He atom associated
with the electron-electron Coulomb interaction. Use this result to
estimate the ionization energy of Helium.

(b) Calculate the ionization energy by using the variational method with an
effective charge $\lambda$ in the hydrogenic wave function as the variational parameter.

(c) Compare (a) and (b) with the experimental ionization energy

$$E_{ion} = 1.807\, E_0 \quad , \quad E_0 = \frac{\alpha^2 m c^2}{2} \quad , \quad \alpha = \text{fine structure constant}$$

You will need

$$\psi_{1s}(r) = \sqrt{\frac{\lambda^3}{\pi}} \exp(-\lambda r) \quad , \quad a_0 = \frac{\hbar^2}{m e^2} \quad , \quad \int\!\!\int d^3r_1\, d^3r_2\, \frac{e^{-\beta(r_1 + r_2)}}{|\vec{r}_1 - \vec{r}_2|} = \frac{20\pi^2}{\beta^5}$$

That last integral is very hard to evaluate from first principles.


10.9.17. Hydrogen atom + xy perturbation 

An electron moves in a Coulomb field centered at the origin of coordinates. The 
first excited state (n = 2) is 4-fold degenerate. Consider what happens in the 
presence of a non-central perturbation 

$$V_{pert} = f(r)\, xy$$

where $f(r)$ is some function only of r, which falls off rapidly as $r \to \infty$. To first
order, this perturbation splits the 4-fold degenerate level into several distinct
levels (some might still be degenerate).

(a) How many levels are there?

(b) What is the degeneracy of each? 

(c) Given the energy shift, call it A E, for one of the levels, what are the values 
of the shifts for all the others? 

10.9.18. Rigid rotator in a magnetic field 

Suppose that the Hamiltonian of a rigid rotator in a magnetic field is of the 
form 

$$H = A L^2 + B L_z + C L_y$$

Assuming that $A, B \gg C$, use perturbation theory to lowest nonvanishing order
to get approximate energy eigenvalues. 

10.9.19. Another rigid rotator in an electric field 

Consider a rigid body with moment of inertia I, which is constrained to rotate 
in the xy-plane, and whose Hamiltonian is

$$H = \frac{1}{2I} L_z^2$$

Find the eigenfunctions and eigenvalues (zeroth order solution). Now assume
the rotator has a fixed dipole moment $\vec{p}$ in the plane. An electric field $\vec{\mathcal{E}}$ is
applied in the plane. Find the change in the energy levels to first and second 
order in the field. 

10.9.20. A Perturbation with 2 Spins 

Let $\vec{S}_1$ and $\vec{S}_2$ be the spin operators of two spin-1/2 particles. Then $\vec{S} = \vec{S}_1 + \vec{S}_2$
is the spin operator for this two-particle system.

(a) Consider the Hamiltonian 

H 0 = a(S 2 + S 2 - S 2 )/h 2 
Determine its eigenvalues and eigenvectors. 

(b) Consider the perturbation $H_1 = \lambda (S_{1x} - S_{2x})$. Calculate the new energies
in first-order perturbation theory.

10.9.21. Another Perturbation with 2 Spins 

Consider a system with the unperturbed Hamiltonian $H_0 = -A(S_{1z} + S_{2z})$ with
a perturbing Hamiltonian of the form $H_1 = B(S_{1x} S_{2x} - S_{1y} S_{2y})$.

(a) Calculate the eigenvalues and eigenvectors of $H_0$.

(b) Calculate the exact eigenvalues of $H_0 + H_1$.

(c) By means of perturbation theory, calculate the first- and the second-order
shifts of the ground state energy of $H_0$, as a consequence of the perturbation
$H_1$. Compare these results with those of (b).

10.9.22. Spherical cavity with electric and magnetic fields 

Consider a spinless particle of mass m and charge e confined in a spherical cavity
of radius R, that is, the potential energy is zero for r < R and infinite for r > R.

(a) What is the ground state energy of this system? 

(b) Suppose that a weak uniform magnetic field of strength B is switched on. 
Calculate the shift in the ground state energy. 

(c) Suppose that, instead, a weak uniform electric field of strength $\mathcal{E}$ is switched
on. Will the ground state energy increase or decrease? Write down, but
do not attempt to evaluate, a formula for the shift in the ground state
energy due to the electric field.

(d) If, instead, a very strong magnetic field of strength B is turned on,
approximately what would be the ground state energy?

10.9.23. Hydrogen in electric and magnetic fields 

Consider the n = 2 levels of a hydrogen-like atom. Neglect spins. Calculate to
lowest order the energy splittings in the presence of both electric and magnetic
fields $\vec{B} = B \hat{e}_z$ and $\vec{\mathcal{E}} = \mathcal{E} \hat{e}_x$.


10.9.24. n = 3 Stark effect in Hydrogen

Work out the Stark effect to lowest nonvanishing order for the n = 3 level of the 
hydrogen atom. Obtain the energy shifts and the zeroth order eigenkets. 


10.9.25. Perturbation of the n = 3 level in Hydrogen - Spin-Orbit and Magnetic Field corrections

In this problem we want to calculate the 1st-order correction to the n = 3 unperturbed
energy of the hydrogen atom due to spin-orbit interaction and magnetic
field interaction for arbitrary strength of the magnetic field. We have
$H = H_0 + H_{so} + H_m$ where

$$H_0 = \frac{\vec{p}_{op}^{\,2}}{2m} + V(r) \quad , \quad V(r) = -e^2 \left( \frac{1}{r} \right)$$

$$H_{so} = \left[ \frac{1}{2m^2c^2} \frac{1}{r} \frac{dV(r)}{dr} \right] \vec{S}_{op} \cdot \vec{L}_{op}$$

$$H_m = \frac{\mu_B}{\hbar} \left( \vec{L}_{op} + 2\vec{S}_{op} \right) \cdot \vec{B}$$

We have two possible choices for basis functions, namely,

$$|n \ell s m_\ell m_s\rangle \quad \text{or} \quad |n \ell s j m_j\rangle$$

The former are easy to write down as direct-product states

$$|n \ell s m_\ell m_s\rangle = R_{n\ell}(r)\, Y_\ell^{m_\ell}(\theta, \varphi)\, |s, m_s\rangle$$

while the latter must be constructed from these direct-product states using addition
of angular momentum methods. The perturbation matrix is not diagonal
in either basis. The number of basis states is given by

$$\sum_{\ell=0}^{n-1=2} (2\ell + 1) \times 2 = 10 + 6 + 2 = 18$$

All the 18 states are degenerate in zero-order. This means that we deal with an 
18 x 18 matrix (mostly zeroes) in degenerate perturbation theory. 
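The state count above is easy to reproduce for any n. The short sketch below is only a bookkeeping check; the function names are hypothetical, and the factor `spin_states=2` is the spin-1/2 multiplicity used in the sum.

```python
def sublevel_counts(n, spin_states=2):
    """Number of (m_l, m_s) states for each l = 0, ..., n-1."""
    return [spin_states * (2 * l + 1) for l in range(n)]


def level_degeneracy(n, spin_states=2):
    """Total number of degenerate zero-order states for principal number n."""
    return sum(sublevel_counts(n, spin_states))


counts = sublevel_counts(3)          # contributions from l = 0, 1, 2
total = level_degeneracy(3)          # size of the degenerate subspace
print(counts, total)
```

The total equals 2n², which for n = 3 gives the dimension of the perturbation matrix.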

Using the direct-product states 

(a) Calculate the nonzero matrix elements of the perturbation and arrange 
them in block-diagonal form. 

(b) Diagonalize the blocks and determine the eigenvalues as functions of B. 

(c) Look at the $B \to 0$ limit. Identify the spin-orbit levels. Characterize them
by $(\ell s j)$.

(d) Look at the large B limit. Identify the Paschen-Back levels.

(e) For small B show the Zeeman splittings and identify the Landé g-factors.

(f) Plot the eigenvalues versus B. 

10.9.26. Stark Shift in Hydrogen with Fine Structure 

Excluding nuclear spin, the n = 2 spectrum of Hydrogen has the configuration:

Figure 10.11: n = 2 Spectrum in Hydrogen

where $\Delta E_{FS}/h = 10$ GHz (the fine structure splitting) and $\Delta E_{Lamb}/h = 1$ GHz
(the Lamb shift - an effect of quantum fluctuations of the electromagnetic field).
These shifts were neglected in the text discussion of the Stark effect. This is
valid if $e a_0 E_z \gg \Delta E$. Let $x = e a_0 E_z$.

(a) Suppose $x < \Delta E_{Lamb}$, but $x \ll \Delta E_{FS}$. Then we need only consider the
$(2s_{1/2}, 2p_{1/2})$ subspace in a near degenerate case. Find the new energy
eigenvectors and eigenvalues to first order. Are they degenerate? For what
value of the field (in volts/cm) is the level separation doubled over the
zero field Lamb shift? HINT: Use the representation of the fine structure
eigenstates in the uncoupled representation.

(b) Now suppose $x > \Delta E_{FS}$. We must include all states in the near degenerate
case. Calculate and plot numerically the eigenvalues as a function of x, in
the range from 0 GHz < x < 10 GHz.

Comment on the behavior of these curves. Do they have the expected
asymptotic behavior? Find analytically the eigenvectors in the limit $x/\Delta E_{FS} \to \infty$.
Show that these are the expected perturbed states.

10.9.27. 2-Particle Ground State Energy 

Estimate the ground state energy of a system of two interacting particles of 
mass $m_1$ and $m_2$ with the interaction energy

$$U(\vec{r}_1 - \vec{r}_2) = C\, |\vec{r}_1 - \vec{r}_2|^4$$


using the variational method. 


10.9.28. 1s2s Helium Energies

Use first-order perturbation theory to estimate the energy difference between
the singlet and triplet states of the (1s2s) configuration of helium. The 2s single
particle state in helium is

$$\psi_{2s}(r) = \frac{1}{\sqrt{4\pi}} \left( \frac{1}{a_0} \right)^{3/2} \left( 2 - \frac{2r}{a_0} \right) e^{-r/a_0}$$


10.9.29. Hyperfine Interaction in the Hydrogen Atom 

Consider the interaction

$$H_{hf} = \frac{\mu_B \mu_N}{a_B^3} \frac{\vec{S}_1 \cdot \vec{S}_2}{\hbar^2}$$

where $\mu_B$, $\mu_N$ are the Bohr magneton and the nuclear magneton, $a_B$ is the
Bohr radius, and $\vec{S}_1$, $\vec{S}_2$ are the proton and electron spin operators.

(a) Show that $H_{hf}$ splits the ground state into two levels:

$$E_t = -1\,\text{Ry} + \frac{A}{4} \quad , \quad E_s = -1\,\text{Ry} - \frac{3A}{4}$$

where $A = \mu_B \mu_N / a_B^3$, and that the corresponding states are triplets and singlets, respectively.

(b) Look up the constants, and obtain the frequency and wavelength of the
radiation that is emitted or absorbed as the atom jumps between the
states. The use of hyperfine splitting is a common way to detect hydrogen,
particularly intergalactic hydrogen.
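Once a transition frequency is in hand, converting it to a wavelength and a photon energy is a one-line calculation. The sketch below assumes the measured hydrogen hyperfine frequency of about 1420.4058 MHz as input (a value quoted here for illustration, not derived from the simplified interaction above) together with the exact SI values of c, h, and e; the function name is hypothetical.

```python
C_LIGHT = 299_792_458.0          # speed of light in m/s (exact SI value)
H_PLANCK = 6.626_070_15e-34      # Planck constant in J s (exact SI value)
E_CHARGE = 1.602_176_634e-19     # joules per electron volt (exact SI value)


def line_properties(frequency_hz):
    """Return (wavelength in m, photon energy in eV) for a given frequency."""
    wavelength_m = C_LIGHT / frequency_hz
    energy_ev = H_PLANCK * frequency_hz / E_CHARGE
    return wavelength_m, energy_ev


# Assumed input: measured hyperfine transition frequency of atomic hydrogen.
NU_HF = 1.420_405_751_768e9      # Hz

wavelength, energy = line_properties(NU_HF)
print(f"wavelength = {100 * wavelength:.3f} cm")
print(f"photon energy = {energy:.3e} eV")
```

A frequency obtained from the estimate in part (a) can be passed to the same function to see how close the simplified model comes to the observed line.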

10.9.30. Dipole Matrix Elements 

Complete with care; this is real physics. The charge dipole operator for the
electron in a hydrogen atom is given by

$$\vec{d}(\vec{r}) = -e \vec{r}$$

Its expectation value in any state vanishes (you should be able to see why
easily), but its matrix elements between different states are important for many
applications (transition amplitudes especially).

(a) Calculate the matrix elements of each of the components between the
1s ground state and each of the 2p states (there are three of them). By
making use of the Wigner-Eckart theorem (which you naturally do without
thinking when doing the integral) the various quantities are reduced to a
single irreducible matrix element and a very manageable set of Clebsch-Gordan
coefficients.

(b) By using actual H-atom wavefunctions (normalized) obtain the magnitude
of quantities as well as the angular dependence (which at certain points
at least are encoded in terms of the $(\ell, m)$ indices).

(c) Reconstruct the vector matrix elements

$$\langle 1s |\, \vec{d}\, | 2p_j \rangle$$

and discuss the angular dependence you find. 

10.9.31. Variational Method 1 

Let us consider the following very simple problem to see how well the variational
method works.

(a) Consider the 1-dimensional harmonic oscillator. Use a Gaussian trial
wave function $\psi_a(x) = e^{-ax^2}$. Show that the variational approach gives
the exact ground state energy.

(b) Suppose for the trial function, we took a Lorentzian


Using the variational method, by what percentage are you off from the
exact ground state energy?

(c) Now consider the double oscillator with potential

$$V(x) = \frac{1}{2} m\omega^2 (|x| - a)^2$$

as shown below: 



Figure 10.12: Double Oscillator Potential 

Argue that a good choice of trial wave functions is:

$$\psi_n^{\pm}(x) = u_n(x - a) \pm u_n(x + a)$$

where the $u_n(x)$ are the eigenfunctions for a harmonic potential centered
at the origin.

(d) Using this show that the variational estimates of the energies are

$$E_n^{\pm} = \frac{A_n \pm B_n}{1 \pm C_n}$$

where

$$A_n = \int u_n(x - a)\, H\, u_n(x - a)\, dx$$
$$B_n = \int u_n(x - a)\, H\, u_n(x + a)\, dx$$
$$C_n = \int u_n(x + a)\, u_n(x - a)\, dx$$

(e) For a much larger than the ground state width, show that

$$\Delta E_0 = E_0^{(-)} - E_0^{(+)} \approx 2\hbar\omega \sqrt{\frac{2V_0}{\pi\hbar\omega}}\; e^{-2V_0/\hbar\omega}$$

where $V_0 = m\omega^2 a^2/2$. This is known as the ground state tunneling splitting.
Explain why.

(f) This approximation clearly breaks down as $a \to 0$. Think about the limits
and sketch the energy spectrum as a function of a.

10.9.32. Variational Method 2


For a particle in a box that extends from -a to +a, try the trial function (within 
the box) 

$$\psi(x) = (x - a)(x + a)$$

and calculate E. There is no parameter to vary, but you still get an upper
bound. Compare it to the true energy. Convince yourself that the singularities
in $\psi''$ at $x = \pm a$ do not contribute to the energy.


10.9.33. Variational Method 3 

For the attractive delta function potential 

$$V(x) = -a V_0\, \delta(x)$$

use a Gaussian trial function. Calculate the upper bound on $E_0$ and compare
it to the exact answer $-m a^2 V_0^2 / 2\hbar^2$.

10.9.34. Variational Method 4 

For an oscillator choose 

$$\psi(x) = \begin{cases} (x - a)^2 (x + a)^2 & |x| \le a \\ 0 & |x| > a \end{cases}$$

calculate $E(a)$, minimize it and compare to $\hbar\omega/2$.

10.9.35. Variation on a linear potential 

Consider the energy levels of the potential V(x) = g\x\. 

(a) By dimensional analysis, reason out the dependence of a general eigenvalue
on the parameters m = mass, $\hbar$ and g.

(b) With the simple trial function

$$\psi(x) = c\, \theta(x + a)\, \theta(a - x) \left( 1 - \frac{|x|}{a} \right)$$

compute (to the bitter end) a variational estimate of the ground state
energy. Here both c and a are variational parameters.

(c) Why is the trial function $\psi(x) = c\, \theta(x + a)\, \theta(a - x)$ not a good one?

(d) Describe briefly (no equations) how you would go about finding a variational
estimate of the energy of the first excited state.

10.9.36. Average Perturbation is Zero

Consider a Hamiltonian 

H.-gUrM 

Hq is perturbed by the spin-orbit interaction for a spin= 1/2 particle, 

H'.p-l 

Show that the average perturbation of all states corresponding to a given term 
(which is characterized by a given L and S) is equal to zero. 

10.9.37. 3-dimensional oscillator and spin interaction 

A spin= 1/2 particle of mass m moves in a spherical harmonic oscillator potential 

$$U = \frac{1}{2} m\omega^2 r^2$$

and is subject to the interaction

$$V = \lambda\, \vec{\sigma} \cdot \vec{r}$$

Compute the shift of the ground state energy through second order. 


10.9.38. Interacting with the Surface of Liquid Helium 

An electron at a distance x from a liquid helium surface feels a potential 


$$V(x) = \begin{cases} -K/x & x > 0 \\ \infty & x \le 0 \end{cases}$$


where K is a constant. 


In Problem 8.7 we solved for the ground state energy and wave function of this 
system. 

Assume that we now apply an electric field and compute the Stark effect shift 
in the ground state energy to first order in perturbation theory. 


10.9.39. Positronium + Hyperfine Interaction 

Positronium is a hydrogen atom but with a positron as the "nucleus" instead of
a proton. In the nonrelativistic limit, the energy levels and wave functions are
the same as for hydrogen, except for scaling due to the change in the reduced
mass.

(a) From your knowledge of the hydrogen atom, write down the normalized
wave function for the 1s ground state of positronium.

(b) Evaluate the root-mean-square radius for the 1s state in units of $a_0$. Is
this an estimate of the physical diameter or radius of positronium?

(c) In the s states of positronium there is a contact hyperfine interaction

$$H_{int} = -\frac{8\pi}{3}\, \vec{\mu}_e \cdot \vec{\mu}_p\, \delta(\vec{r})$$

where $\vec{\mu}_e$ and $\vec{\mu}_p$ are the electron and positron magnetic moments and

$$\vec{\mu} = \frac{g e}{2mc}\, \vec{S}$$

Using first order perturbation theory compute the energy difference between
the singlet and triplet ground states. Determine which lies lowest.
Express the energy splitting in GHz. Get a number!


10.9.40. Two coupled spins 

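Two oppositely charged spin-1/2 particles (spins $\vec{s}_1 = \hbar\vec{\sigma}_1/2$ and $\vec{s}_2 = \hbar\vec{\sigma}_2/2$) are
coupled in a system with a spin-spin interaction energy $\Delta E$. The system is
placed in a uniform magnetic field $\vec{B} = B\hat{z}$. The Hamiltonian for the spin
interaction is

$$\hat{H} = \frac{\Delta E}{4}\, \vec{\sigma}_1 \cdot \vec{\sigma}_2 - (\vec{\mu}_1 + \vec{\mu}_2) \cdot \vec{B}$$

where $\vec{\mu}_j = g_j \mu_0 \vec{s}_j / \hbar$ is the magnetic moment of the jth particle.

(a) If we define the 2-particle basis-states in terms of the 1-particle states by

$$|1\rangle = |+\rangle_1 |+\rangle_2 \;,\; |2\rangle = |+\rangle_1 |-\rangle_2 \;,\; |3\rangle = |-\rangle_1 |+\rangle_2 \;,\; |4\rangle = |-\rangle_1 |-\rangle_2$$

where

$$\sigma_{ix} |\pm\rangle_i = |\mp\rangle_i \;,\; \sigma_{iy} |\pm\rangle_i = \pm i\, |\mp\rangle_i \;,\; \sigma_{iz} |\pm\rangle_i = \pm |\pm\rangle_i$$

and

$$\sigma_{1x}\sigma_{2x} |1\rangle = \sigma_{1x}\sigma_{2x} |+\rangle_1 |+\rangle_2 = (\sigma_{1x} |+\rangle_1)(\sigma_{2x} |+\rangle_2) = |-\rangle_1 |-\rangle_2 = |4\rangle$$

then derive the results below.

The energy eigenvectors for the 4 states of the system, in terms of the
eigenvectors of the z-component of the operators $\vec{\sigma}_i = 2\vec{S}_i/\hbar$ are

$$|1'\rangle = |+\rangle_1 |+\rangle_2 = |1\rangle \;,\; |2'\rangle = d\, |-\rangle_1 |+\rangle_2 + c\, |+\rangle_1 |-\rangle_2 = d\, |3\rangle + c\, |2\rangle$$

$$|3'\rangle = c\, |-\rangle_1 |+\rangle_2 - d\, |+\rangle_1 |-\rangle_2 = c\, |3\rangle - d\, |2\rangle \;,\; |4'\rangle = |-\rangle_1 |-\rangle_2 = |4\rangle$$

where

$$\sigma_{iz} |\pm\rangle_i = \pm |\pm\rangle_i$$

as stated above and

$$d = \frac{1}{\sqrt{2}} \left( 1 - \frac{x}{\sqrt{1 + x^2}} \right)^{1/2} \;,\; c = \frac{1}{\sqrt{2}} \left( 1 + \frac{x}{\sqrt{1 + x^2}} \right)^{1/2} \;,\; x = \frac{\mu_0 B (g_2 - g_1)}{\Delta E}$$

(b) Find the energy eigenvalues associated with the 4 states.

(c) Discuss the limiting cases

$$\frac{\mu_0 B}{\Delta E} \gg 1 \quad , \quad \frac{\mu_0 B}{\Delta E} \ll 1$$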

Plot the energies as a function of the magnetic field. 


10.9.41. Perturbed Linear Potential 


A particle moving in one-dimension is bound by the potential 


$$V(x) = \begin{cases} ax & x > 0 \\ \infty & x < 0 \end{cases}$$

where a > 0 is a constant. Estimate the ground state energy using first-order
perturbation theory by the following method: Write $V = V_0 + V_1$ where $V_0(x) = bx^2$,
$V_1(x) = ax - bx^2$ (for x > 0), where b is a constant and treat $V_1$ as a
perturbation.


10.9.42. The ac-Stark Effect 

Suppose an atom is perturbed by a monochromatic electric field oscillating at
frequency $\omega_L$, $\vec{E}(t) = E_z \cos(\omega_L t)\, \hat{e}_z$ (such as from a linearly polarized laser),
rather than the dc-field studied in the text. We know that such a field can be
absorbed and cause transitions between the energy levels: we will systematically
study this effect in Chapter 11. The laser will also cause a shift of energy levels
of the unperturbed states, known alternatively as the ac-Stark effect, the light
shift, and sometimes the Lamp shift (don't you love physics humor). In this
problem, we will look at this phenomenon in the simplest case that the field
is near to resonance between the ground state $|g\rangle$ and some excited state $|e\rangle$,
$\omega_L \approx \omega_{eg} = (E_e - E_g)/\hbar$, so that we can ignore all other energy levels in the
problem (the two-level atom approximation).

(i) The classical picture. Consider first the Lorentz oscillator model of
the atom - a charge on a spring - with natural resonance at $\omega_0$. The
Hamiltonian for the system is

$$H = \frac{p^2}{2m} + \frac{1}{2} m\omega_0^2 z^2 - \vec{d} \cdot \vec{E}(t)$$

where $d = -ez$ is the dipole.

Figure 10.13: Lorentz Oscillator


(a) Ignoring damping of the oscillator, use Newton’s Law to show that 
the induced dipole moment is 

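$$d_{induced}(t) = \alpha E(t) = \alpha E_z \cos(\omega_L t)$$

where

$$\alpha = \frac{e^2/m}{\omega_0^2 - \omega_L^2} \approx \frac{-e^2}{2m\omega_0 \Delta}$$

is the polarizability with $\Delta = \omega_L - \omega_0$ the detuning.

(b) Use your solution to show that the total energy stored in the system
is

$$H = -\frac{1}{2}\, d_{induced}(t)\, E(t) = -\frac{1}{2}\, \alpha E^2(t)$$

or the time average value of H is

$$\bar{H} = -\frac{1}{4}\, \alpha E_z^2$$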

Note the factor of 1/2 arises because energy is required to create the 
dipole. 

(ii) The quantum picture. We consider the two-level atom described above. 
The Hamiltonian for this system can be written in a time independent form 
(equivalent to the time-averaging done in the classical case). 


$$\hat{H} = \hat{H}_{atom} + \hat{H}_{int}$$

where $\hat{H}_{atom} = -\hbar\Delta\, |e\rangle\langle e|$ is the unperturbed atomic Hamiltonian and
$\hat{H}_{int} = -\frac{\hbar\Omega}{2} \left( |e\rangle\langle g| + |g\rangle\langle e| \right)$ is the dipole-interaction with $\hbar\Omega = \langle e|\, \vec{d}\, |g\rangle \cdot \vec{E}$.

(a) Find the exact energy eigenvalues and eigenvectors for this simple
two dimensional Hilbert space and plot the levels as a function of $\Delta$.
These are known as the atomic dressed states.

(b) Expand your solution in (a) to lowest nonvanishing order in $\Omega$ to find
the perturbation to the energy levels. Under what conditions is this
expansion valid?






(c) Confirm your answer to (b) using perturbation theory. Find also
the mean induced dipole moment (to lowest order in perturbation
theory), and from this show that the atomic polarizability, defined
by $\langle \vec{d} \rangle = \alpha \vec{E}$ is given by

$$\alpha = -\frac{|\langle e|\, \vec{d}\, |g\rangle|^2}{\hbar\Delta}$$

so that the second order perturbation to the ground state is
$E_g^{(2)} = -\frac{1}{4}\, \alpha E_z^2$ as in part (b).

(d) Show that the ratio of the polarizability calculated classically in (b)
and the quantum expression in (c) has the form

$$f = \frac{\alpha_{quantum}}{\alpha_{classical}} = \frac{|\langle e|\, z\, |g\rangle|^2}{(\Delta z^2)_{SHO}}$$

where $(\Delta z^2)_{SHO}$ is the SHO zero point variance. This is also known
as the oscillator strength.


We see that in lowest order perturbation theory an atomic resonance looks
just like a harmonic oscillator with a correction factor given by the oscillator
strength and off-resonance harmonic perturbations cause energy level
shifts as well as absorption and emission (Chapter 11).
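The near-resonance form of the classical polarizability used in part (i) can be checked against the full Lorentz-oscillator expression with a few lines of arithmetic. This is a minimal sketch in assumed units where e = m = ω0 = 1; the function names and the chosen detuning are illustrative only.

```python
def alpha_full(omega_L, omega_0, e=1.0, m=1.0):
    """Lorentz-oscillator polarizability (e^2/m) / (omega_0^2 - omega_L^2)."""
    return (e ** 2 / m) / (omega_0 ** 2 - omega_L ** 2)


def alpha_near_resonance(omega_L, omega_0, e=1.0, m=1.0):
    """Near-resonance approximation -e^2 / (2 m omega_0 Delta),
    with detuning Delta = omega_L - omega_0."""
    delta = omega_L - omega_0
    return -e ** 2 / (2.0 * m * omega_0 * delta)


# Assumed illustrative values: drive 1% above resonance.
omega_0, omega_L = 1.0, 1.01
full = alpha_full(omega_L, omega_0)
approx = alpha_near_resonance(omega_L, omega_0)
relative_error = abs(approx - full) / abs(full)
print(f"full = {full:.4f}, near-resonance = {approx:.4f}, "
      f"relative error = {relative_error:.2%}")
```

The relative error is of order Δ/(2ω0), so the approximation improves as the drive is tuned closer to resonance, and the sign of the polarizability follows the sign of the detuning.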


10.9.43. Light-shift for multilevel atoms 

We found the ac-Stark (light shift) for the case of a two-level atom driven by a
monochromatic field. In this problem we want to look at this phenomenon in a
more general context, including arbitrary polarization of the electric field and
atoms with multiple sublevels.

Consider then a general monochromatic electric field $\vec{E}(\vec{x},t) = \mathrm{Re}\left( \vec{E}(\vec{x})\, e^{-i\omega_L t} \right)$,
driving an atom near resonance on the transition $|g; J_g\rangle \to |e; J_e\rangle$, where the
ground and excited manifolds are each described by some total angular momentum
J with degeneracy 2J + 1. The generalization of the ac-Stark shift is now
the light-shift operator acting on the $2J_g + 1$ dimensional ground manifold:

$$\hat{V}_{LS}(\vec{x}) = -\frac{1}{4}\, \vec{E}^*(\vec{x}) \cdot \hat{\alpha} \cdot \vec{E}(\vec{x})$$

Here

$$\hat{\alpha} = -\frac{\vec{d}_{ge}\, \vec{d}_{eg}}{\hbar\Delta}$$

is the atomic polarizability tensor operator, where $\vec{d}_{eg} = \hat{P}_e\, \vec{d}\, \hat{P}_g$ is the dipole
operator, projected between the ground and excited manifolds; the projector
onto the excited manifold is

$$\hat{P}_e = \sum_{M_e = -J_e}^{J_e} |e; J_e, M_e\rangle \langle e; J_e, M_e|$$

and similarly for the ground manifold.


(a) By expanding the dipole operator in the spherical basis $(\pm, 0)$, show that
the polarizability operator can be written

$$\hat{\alpha} = \tilde{\alpha} \left( \sum_{q, M_g} \left| C_{M_g}^{M_g + q} \right|^2 \vec{e}_q^{\,*}\, |g, J_g, M_g\rangle \langle g, J_g, M_g|\, \vec{e}_q + \sum_{q \neq q', M_g} C_{M_g + q - q'}^{M_g + q}\, C_{M_g}^{M_g + q}\, \vec{e}_{q'}^{\,*}\, |g, J_g, M_g + q - q'\rangle \langle g, J_g, M_g|\, \vec{e}_q \right)$$

where

$$\tilde{\alpha} = -\frac{|\langle e; J_e \| d \| g; J_g \rangle|^2}{\hbar\Delta}$$

and

$$C_{M_g}^{M_e} = \langle J_e M_e | 1 q J_g M_g \rangle$$

Explain physically, using dipole selection rules, the meaning of the expression
for $\hat{\alpha}$.


(b) Consider a polarized plane wave, with complex amplitude of the form
$\vec{E}(\vec{x}) = E_1 \vec{\varepsilon}_L\, e^{i\vec{k} \cdot \vec{x}}$ where $E_1$ is the amplitude and $\vec{\varepsilon}_L$ the polarization (possibly
complex). For an atom driven on the transition $|g; J_g = 1\rangle \to |e; J_e = 2\rangle$
and the cases (i) linear polarization along z, (ii) positive helicity polarization,
(iii) linear polarization along x, find the eigenvalues and eigenvectors
of the light-shift operator. Express the eigenvalues in units of

$$V_1 = -\frac{1}{4}\, \tilde{\alpha}\, |E_1|^2.$$

Please comment on what you find for cases (i) and (iii). Repeat for
$|g; J_g = 1/2\rangle \to |e; J_e = 3/2\rangle$ and comment.

(c) A deeper insight into the light-shift potential can be seen by expressing 
the polarizability operator in terms of irreducible tensors. Verify that the 
total light shift is the sum of scalar, vector, and rank-2 irreducible tensor 
interactions, 


V LS = 

where 

and 


~ (|A(5)| 2 a(°) + (E*(x) x E(x) • a« + E*(x) ■ cV 2 ) • E(x)) 


(0) _ ^ge • d eg (J) _ dge x deg 


-3hA ’ 


-2hA 


a(2) - = MA 


d l d J + d J d l 

a ge a ge u ge u ge _ (fi) z 
2 ij 


/ 
(d) For the particular case of $|g; J_g = 1/2\rangle \to |e; J_e = 3/2\rangle$, show that the rank-2 
tensor part vanishes. Show that the light-shift operator can be written in 
a basis-independent form as a scalar interaction (independent of sublevel), 
plus an effective Zeeman interaction for a fictitious B-field interacting with 
the spin-1/2 ground state, 

$$V_{LS} = V_0(\vec x) \, I + \vec B_{fict}(\vec x) \cdot \vec\sigma$$

where 

$$V_0(\vec x) = \frac{2}{3} V_1 |\vec\varepsilon_L(\vec x)|^2 \qquad \text{(proportional to field intensity)}$$

and 

$$\vec B_{fict}(\vec x) = \frac{1}{3} V_1 \, \frac{\vec\varepsilon_L^{\,*}(\vec x) \times \vec\varepsilon_L(\vec x)}{i} \qquad \text{(proportional to field ellipticity)}$$

and we have written $\vec E(\vec x) = E_1 \vec\varepsilon_L(\vec x)$. Use this form to explain your 
results from part (b) on the transition $|g; J_g = 1/2\rangle \to |e; J_e = 3/2\rangle$. 
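
As a numerical cross-check of the spin-1/2 case in parts (b) and (d), the 
following minimal Python sketch builds the 2x2 light-shift matrix for 
J_g = 1/2 -> J_e = 3/2 in units of V_1 and diagonalizes it. Several things 
here are assumptions not fixed by the problem text: the Clebsch-Gordan values 
are the standard Condon-Shortley ones, the spherical unit vectors are taken as 
e_{+1} = -(x + iy)/sqrt(2), e_0 = z, e_{-1} = (x - iy)/sqrt(2), the field 
components are taken as eps_q = conj(e_q) . eps, and all function names are 
hypothetical. 

```python
import math

# Clebsch-Gordan coefficients <3/2, m+q | 1 q; 1/2 m>, keyed by (2m, q).
# Assumed: standard Condon-Shortley phase convention.
CG = {
    (1, 1): 1.0, (1, 0): math.sqrt(2 / 3), (1, -1): math.sqrt(1 / 3),
    (-1, 1): math.sqrt(1 / 3), (-1, 0): math.sqrt(2 / 3), (-1, -1): 1.0,
}

def spherical_components(ex, ey, ez):
    """Return eps_q = conj(e_q) . eps for q = +1, 0, -1 (assumed convention)."""
    s = math.sqrt(2.0)
    return {1: -(ex - 1j * ey) / s, 0: complex(ez), -1: (ex + 1j * ey) / s}

def light_shift_matrix(eps):
    """2x2 light-shift matrix in units of V_1; basis order (m=+1/2, m=-1/2)."""
    basis = [1, -1]  # stores 2m
    v = [[0j, 0j], [0j, 0j]]
    for col, m2 in enumerate(basis):
        for q in (1, 0, -1):        # virtual absorption: m -> m + q
            for qp in (1, 0, -1):   # virtual emission: m + q -> m + q - qp
                m2_final = m2 + 2 * (q - qp)
                if m2_final not in basis:
                    continue
                row = basis.index(m2_final)
                v[row][col] += (eps[qp].conjugate() * eps[q]
                                * CG[(m2_final, qp)] * CG[(m2, q)])
    return v

def eigenvalues(v):
    """Eigenvalues (ascending) of a 2x2 Hermitian matrix."""
    a, d, b = v[0][0].real, v[1][1].real, v[0][1]
    mean = 0.5 * (a + d)
    half = math.hypot(0.5 * (a - d), abs(b))
    return mean - half, mean + half

r = 1 / math.sqrt(2)
cases = {
    "linear z": spherical_components(0, 0, 1),
    "linear x": spherical_components(1, 0, 0),
    "positive helicity": spherical_components(-r, -1j * r, 0),
}
for name, eps in cases.items():
    print(name, eigenvalues(light_shift_matrix(eps)))
```

Under these assumptions the two linear polarizations give the same doubly 
degenerate eigenvalue, while the circular case splits the two sublevels. 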


10.9.44. A Variational Calculation 


Consider the one-dimensional box potential given by 

$$V(x) = \begin{cases} 0 & \text{for } |x| < a \\ \infty & \text{for } |x| > a \end{cases}$$

Use the variational principle with the trial function 

$$\psi(x) = |a|^\lambda - |x|^\lambda$$

where $\lambda$ is a variational parameter, to estimate the ground state energy. 
Compare the result with the exact answer. 
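
The minimization can be checked numerically. The sketch below assumes the 
trial function is a^lambda - |x|^lambda on |x| < a, for which the two integrals 
can be done in closed form and give 
<H> = (hbar^2 / 4 m a^2) (lambda + 1)(2 lambda + 1)/(2 lambda - 1), valid for 
lambda > 1/2. It then minimizes the ratio to the exact ground-state energy 
pi^2 hbar^2 / (8 m a^2) with a golden-section search. The function names are 
hypothetical. 

```python
import math

def energy_ratio(lam):
    """<H> / E_exact for psi = a^lam - |x|^lam, E_exact = pi^2 hbar^2 / (8 m a^2)."""
    return 2.0 * (lam + 1.0) * (2.0 * lam + 1.0) / ((2.0 * lam - 1.0) * math.pi ** 2)

def golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

# The kinetic-energy integral diverges for lam <= 1/2, so search just above it.
lam_opt = golden_min(energy_ratio, 0.5 + 1e-6, 5.0)
print("optimal lambda:", lam_opt)
print("analytic lambda:", (1.0 + math.sqrt(6.0)) / 2.0)
print("E_var / E_exact:", energy_ratio(lam_opt))
```

The analytic minimizer follows from setting the derivative to zero, which 
gives 4 lambda^2 - 4 lambda - 5 = 0; the resulting estimate lies only a 
fraction of a percent above the exact energy. 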


10.9.45. Hyperfine Interaction Redux 

An important effect in the study of atomic spectra is the so-called hyperfine 
interaction - the magnetic coupling between the electron spin and the nuclear 
spin. Consider Hydrogen. The hyperfine interaction Hamiltonian has the form 

$$H_{hf} = g_s g_i \mu_B \mu_N \frac{1}{r^3} \, \vec s \cdot \vec i$$

where s is the electron’s spin-1/2 angular momentum and i is the proton’s 
spin-1/2 angular momentum and the appropriate g-factors and magnetons are 
given. 

(a) In the absence of the hyperfine interaction, but including the electron and 
proton spin in the description, what is the degeneracy of the ground state? 
Write all the quantum numbers associated with the degenerate sublevels. 

(b) Now include the hyperfine interaction. Let $\vec f = \vec i + \vec s$ be the total spin 
angular momentum. Show that the ground state manifold is described 
with the good quantum numbers $|n = 1, l = 0, s = 1/2, i = 1/2, f, m_f\rangle$. What 
are the possible values of $f$ and $m_f$? 

(c) The perturbed 1s ground state now has hyperfine splitting. The energy 
level diagram is sketched below. 





Figure 10.14: Hyperfine Splitting 

Label all the quantum numbers for the four sublevels shown in the figure. 

(d) Show that the energy level splitting is 

$\Delta E_{hf} =$ 

Show numerically that this splitting gives rise to the famous 21 cm radio 
frequency radiation used in astrophysical observations. 


10.9.46. Find a Variational Trial Function 

We would like to find the ground-state wave function of a particle in the potential 
$V = 50(e^{-x} - 1)^2$ with $m = 1$ and $\hbar = 1$. In this case, the true ground state energy 
is known to be $E_0 = 39/8 = 4.875$. Plot the form of the potential. Note that the 
potential is more or less quadratic at the minimum, yet it is skewed. Find a 
variational wave function that comes within 5% of the true energy. OPTIONAL: 
How might you find the exact analytical solution? 
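
One possible choice, offered only as a sketch and not as the intended answer, 
is a displaced Gaussian, psi(x) ~ exp(-(x - x0)^2 / (4 s2)), whose probability 
density has mean x0 and variance s2. With m = hbar = 1 the expectation values 
are elementary: <T> = 1/(8 s2) and <exp(-k x)> = exp(-k x0 + k^2 s2 / 2). The 
code below minimizes <H> over the two assumed parameters x0 and s2 by 
alternating one-dimensional golden-section searches. All names are 
hypothetical. 

```python
import math

E_EXACT = 39.0 / 8.0  # true ground-state energy quoted in the problem

def energy(x0, s2):
    """<H> for a Gaussian density with mean x0 and variance s2 (m = hbar = 1)."""
    kinetic = 1.0 / (8.0 * s2)
    potential = 50.0 * (math.exp(-2.0 * x0 + 2.0 * s2)
                        - 2.0 * math.exp(-x0 + 0.5 * s2) + 1.0)
    return kinetic + potential

def golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

# Start from the harmonic approximation V ~ 50 x^2 (omega = 10, variance 1/20).
x0, s2 = 0.0, 0.05
for _ in range(60):
    x0 = golden_min(lambda x: energy(x, s2), -1.0, 1.0)
    s2 = golden_min(lambda s: energy(x0, s), 1e-3, 1.0)

e_var = energy(x0, s2)
print("x0 =", x0, " variance =", s2)
print("E_var =", e_var, " relative error =", (e_var - E_EXACT) / E_EXACT)
```

Shifting the Gaussian toward positive x accounts for the skew of the 
potential; with this trial family the estimate should land a little over 1% 
above the quoted energy, inside the requested 5%. 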

10.9.47. Hydrogen Corrections on 2s and 2p Levels 

Work out the first-order shifts in energies of 2s and 2p states of the hydrogen 
atom due to relativistic corrections, the spin-orbit interaction and the so-called 
Darwin term, 

$$-\frac{p^4}{8m^3c^2} + g \, \frac{1}{4m^2c^2} \, \frac{1}{r} \frac{dV_c}{dr} \, (\vec L \cdot \vec S) + \frac{\hbar^2}{8m^2c^2} \nabla^2 V_c, \qquad V_c = -\frac{Ze^2}{r}$$

where you should be able to show that $\nabla^2 V_c = 4\pi Z e^2 \delta(\vec r)$. At the end of the 
calculation, take g = 2 and evaluate the energy shifts numerically. 
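
For the final numerical step, the sketch below tabulates the three 
contributions for n = 2 in units of alpha^4 m c^2 and converts them to eV. It 
does not derive anything: it assumes Z = 1, g = 2, and the standard textbook 
closed forms for the hydrogen expectation values (relativistic term 
-(E_n^2 / 2mc^2)(4n/(l + 1/2) - 3), spin-orbit term for l > 0, Darwin term 
for l = 0 only), with E_n = -alpha^2 m c^2 / (2 n^2). The constants are 
rounded CODATA values and the function name is hypothetical. 

```python
from fractions import Fraction as F

ALPHA = 1 / 137.035999   # fine-structure constant (rounded)
MC2_EV = 510998.95       # electron rest energy in eV (rounded)

def shifts(n, l, j):
    """(relativistic, spin-orbit, Darwin) shifts in units of alpha^4 m c^2."""
    e2 = F(1, 4 * n ** 4)                     # E_n^2 / (m c^2), same units
    rel = -e2 / 2 * (F(4 * n) / (l + F(1, 2)) - 3)
    if l == 0:
        so = F(0)                             # no spin-orbit shift for s states
        darwin = 2 * n * e2                   # contact term acts on s states only
    else:
        so = e2 * n * (j * (j + 1) - l * (l + 1) - F(3, 4)) / (l * (l + F(1, 2)) * (l + 1))
        darwin = F(0)
    return rel, so, darwin

levels = {"2s1/2": (2, 0, F(1, 2)), "2p1/2": (2, 1, F(1, 2)), "2p3/2": (2, 1, F(3, 2))}
unit_ev = ALPHA ** 4 * MC2_EV
for name, (n, l, j) in levels.items():
    parts = shifts(n, l, j)
    total = sum(parts)
    print(name, [str(p) for p in parts], "total =", total,
          "=", float(total) * unit_ev, "eV")
```

A useful consistency check on any hand calculation is that the 2s1/2 and 
2p1/2 totals coincide at this order, so only the 2p3/2 level is split off. 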

10.9.48. Hyperfine Interaction Again 

Show that the interaction between two magnetic moments is given by the 
Hamiltonian 

$$H = -\frac{2}{3} \mu_0 (\vec\mu_1 \cdot \vec\mu_2) \, \delta(\vec x - \vec y) - \frac{\mu_0}{4\pi} \frac{1}{r^3} \left( 3 \frac{r_i r_j}{r^2} - \delta_{ij} \right) \mu_1^i \mu_2^j$$

where $r_i = x_i - y_i$. (NOTE: Einstein summation convention used above). Use 
first-order perturbation theory to calculate the splitting between the F = 0, 1 
levels of the hydrogen atom and the corresponding wavelength of the photon 
emission. How does the splitting compare to the temperature of the cosmic 
microwave background? 
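
For the last comparison, a short unit conversion is enough. The sketch below 
does not compute the splitting from the Hamiltonian; it takes the measured 
hydrogen hyperfine frequency (about 1420.4 MHz) as a reference value and 
converts it to a wavelength, an energy, and an equivalent temperature E / k_B. 
The constants are rounded CODATA values, the CMB temperature is taken as 
2.725 K, and the variable names are hypothetical. 

```python
H_EV_S = 4.135667696e-15     # Planck constant in eV s
C_M_S = 299792458.0          # speed of light in m/s
K_B_EV_K = 8.617333262e-5    # Boltzmann constant in eV/K
NU_HF_HZ = 1420.405751768e6  # measured hydrogen hyperfine frequency (reference)
T_CMB_K = 2.725              # cosmic microwave background temperature

wavelength_cm = 100.0 * C_M_S / NU_HF_HZ
energy_ev = H_EV_S * NU_HF_HZ
temperature_k = energy_ev / K_B_EV_K

print("wavelength (cm):", wavelength_cm)
print("splitting (eV):", energy_ev)
print("equivalent temperature (K):", temperature_k)
print("T_CMB / T_splitting:", T_CMB_K / temperature_k)
```

The equivalent temperature comes out well below the CMB temperature, so 
thermal radiation at 2.725 K carries more than enough energy per photon to 
populate the upper hyperfine level. 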

10.9.49. A Perturbation Example 

Suppose we have two spin-1/2 degrees of freedom, A and B. Let the initial 
Hamiltonian for this joint system be given by 

$$H_0 = -\gamma B_z \left( S_z^A \otimes I^B + I^A \otimes S_z^B \right)$$

where $I^A$ and $I^B$ are identity operators, $S_z^A$ is the observable for the z-component 
of the spin for the system A, and $S_z^B$ is the observable for the z-component of 
the spin for the system B. Here the notation is meant to emphasize that both 
spins experience the same magnetic field $\vec B = B_z \hat z$ and have the same 
gyromagnetic ratio $\gamma$. 

(a) Determine the energy eigenvalues and eigenstates for $H_0$. 

(b) Suppose we now add a perturbation term, $H_{total} = H_0 + W$, where 

$$W = \lambda \, \vec S^A \cdot \vec S^B = \lambda \left( S_x^A \otimes S_x^B + S_y^A \otimes S_y^B + S_z^A \otimes S_z^B \right)$$

Compute the first-order corrections to the energy eigenvalues. 
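
A hand calculation for part (b) can be checked with a few lines of matrix 
arithmetic. The sketch below assumes hbar = 1 and lambda = 1, takes the 
unperturbed Hamiltonian to be proportional to the total z-component of spin, 
and orders the product basis as (up up, up down, down up, down down). The two 
middle states are degenerate under such an H_0, so their first-order shifts 
are the eigenvalues of W restricted to that 2x2 block. The helper names are 
hypothetical. 

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    na, nb = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
            for i in range(na) for k in range(nb)]

def add(*mats):
    """Entrywise sum of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]

# Spin-1/2 operators with hbar = 1 (assumed units).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

# W = S^A . S^B with lambda = 1; basis order (uu, ud, du, dd).
W = add(kron(SX, SX), kron(SY, SY), kron(SZ, SZ))

# Non-degenerate levels: the first-order shift is the diagonal element.
shift_up_up = complex(W[0][0]).real
shift_down_down = complex(W[3][3]).real

# Degenerate pair (ud, du): diagonalize W inside the 2x2 block.
a, d, b = complex(W[1][1]).real, complex(W[2][2]).real, complex(W[1][2])
mean = 0.5 * (a + d)
half = math.hypot(0.5 * (a - d), abs(b))
block_shifts = (mean - half, mean + half)

print("up-up:", shift_up_up, " down-down:", shift_down_down)
print("degenerate block:", block_shifts)
```

Multiplying each number by lambda hbar^2 restores the units. Three of the four 
shifts coincide, which reflects the triplet and singlet structure of the 
coupling. 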

10.9.50. More Perturbation Practice 

Consider two spin-1/2 degrees of freedom, whose joint pure states can be 
represented by state vectors in the tensor-product Hilbert space 
$\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$, where $\mathcal{H}_A$ and $\mathcal{H}_B$ are each two-dimensional. Suppose 
that the initial Hamiltonian for the spins is 

$$H_0 = \left( -\gamma_A B_z S_z^A \right) \otimes I^B + I^A \otimes \left( -\gamma_B B_z S_z^B \right)$$

(a) Compute the eigenstates and eigenenergies of $H_0$, assuming $\gamma_A \neq \gamma_B$ and 
that the gyromagnetic ratios are non-zero. If it is obvious to you what the 
eigenstates are, you can just guess them and compute the eigenenergies. 

(b) Compute the first-order corrections to the eigenstates under the 
perturbation 

$$W = \alpha \, S^A \otimes S^B$$

where $\alpha$ is a small parameter with appropriate units. 